Preface



Relationships amongst propositions are crucial pieces of knowledge. They 
express causal or plausible connections, bring isolated facts together, and 
help us obtain a coherent image of the world. Such relationships may be 
represented in a most general form by if-then-conditionals.

Conditionals are omnipresent, in everyday life as well as in scientific envi- 
ronments. We make use of conditional knowledge when we avoid puddles on 
sidewalks (being aware of “If you step into a puddle, then your feet might get 
wet”) and when we expect high wheat prices from observing cold and rainy 
weather in spring and summer (due to “If the growing weather is poor then 
there will be an increase in the price of wheat”). Conditionals represent
generic knowledge, acquired inductively from experience or learned from books.
They tie a flexible and highly interrelated network of connections along which 
reasoning is possible and which can be applied to different situations. 

Therefore, conditionals are most important, but also quite problematic 
objects in knowledge representation. They are not simply “true” or “false”, 
like classical logical entities. In a particular situation, a conditional is appli- 
cable (you actually step into a puddle) or not (you simply walk around), it 
can be found confirmed (you step into a puddle and indeed, your feet get wet) 
or violated (you step into a puddle, but your feet remain dry because you 
are wearing rain boots). So the central problem in representing and mode- 
ling conditional knowledge is to handle adequately, on the one hand, inactive 
(or neutral, respectively) behavior, and, on the other hand, active as well as 
polarizing behavior. 

This book presents a new approach to conditionals which captures this 
dynamic, non-propositional nature of conditionals particularly well.
Conditionals are considered as agents shifting possible worlds in order to establish
relationships and beliefs. This understanding of conditionals yields a rich me- 
thodological theory, which makes complex interactions between conditionals 
transparent and operational. Moreover, it provides a unifying and enhanced 
framework for knowledge representation, nonmonotonic reasoning, and belief 
revision, and even for knowledge discovery. In separating structural from nu- 
merical aspects, the basic techniques for conditionals introduced in this book 
are applied both in a qualitative and in a numerical setting, elaborating
fundamental lines of reasoning.

The novel theory of conditionals is at the heart of this work, from which 
its other major topics - revising epistemic states, probabilistic and nonmono- 
tonic reasoning, and knowledge discovery - are developed. So central concerns 
of Artificial Intelligence research are dealt with in a uniform and homogeneous 
way by investigating structures of conditional knowledge. Such structures are 
substantial, for instance, in abductive as well as in predictive reasoning, or 
for simulation tasks. 

Several people contributed to the making of this book, which is a revised
version of my habilitation thesis at the FernUniversität Hagen, Department
of Computer Science. In the first place, I would like to thank Christoph
Beierle for accompanying this work with his criticism and his support, and
for refereeing the thesis. I am also very grateful to the other referees, Gerhard
Brewka and Dov Gabbay, and to Wilhelm Rödder, who infected me with
his enthusiasm for probabilistic conditionals and the principle of maximum
entropy.

Thanks to Jeff Paris, Gerhard Brewka, and Karl Schlechta for discussing 
and sharing new ideas with me. Special thanks to Jeff for improving my 
English. 

Parts of the results presented in this book were obtained while I was
supported by a Lise-Meitner scholarship, Department of Science and Research,
North Rhine-Westphalia, Germany.

I dedicate this book to my husband, Klaus, for encouraging me all the 
time, and to my children Silja, Maj-Britt, and Malte, for their creativity. 
1. Introduction



Conditionals are closely connected with reasoning - typically, they suggest 
plausible, yet often defeasible conclusions from what is evident. Therefore, 
studying conditionals means primarily to overcome the strict framework of 
classical logic and to enter into the world of defeasible reasoning, nonmono- 
tonic logics and uncertain knowledge. 



1.1 “Believe It or Not” - The Handling of Uncertain Knowledge

From the beginnings of Artificial Intelligence, commonsense and expert-like 
reasoning has been modelled in two basically different ways: The quantitative 
approaches using certainty factors [BS84, Voo96b], fuzzy logic [Zad83, Yag85, 
KGK93], belief functions [Dem67, Sha76, Sha86], possibilities [DP92, DLP94],
and probabilities [DeF74, Pea88, Bac90, Par94], and the symbolic approaches
like circumscription [McC80], autoepistemic logic [Moo88], default logic
[Rei80] and many others (the references given are only examples). While the
former methods have proved to be successful in practical reasoning, the lat- 
ter have helped to reveal and model structures of uncertain reasoning in a 
qualitative way. 

Probability theory here occupies an outstanding position: Devised and 
developed to perform sound reasoning in a quantitative setting, it not only 
became an important benchmark in the area of quantified reasoning in gene- 
ral, but also provides a semantics for qualitative default reasoning by conside- 
ring infinitesimal probabilities (see [Ada75, Pea89]) or orders of magnitudes 
of probabilities (see [GP96]). Within the last decades, knowledge represen- 
tation and reasoning based upon probability theory has received increasing 
attention in the area of artificial intelligence. Probability theory provides 
a solid foundation for nonmonotonic reasoning methods ([Ada75, Bac90,
Gef92, G0I94, Pea88]), and probabilistic networks allow a consistent
computation of (quantified) uncertainty ([Pea86, LS88, RKI97b, RM96]).
Probabilities are particularly appropriate to quantify conditional statements
“If A then B”, which are of major interest in the areas of nonmonotonic
reasoning and belief revision dealing with the dynamics of belief (see, for
instance, [Nut80, Cox46, Cal91, DP91b, DP97a, LM92]).

G. Kern-Isberner: Conditionals in NMR and Belief Revision, LNCS 2087, pp. 1-10, 2001.

The property of monotonicity is undoubtedly crucial for classical deduc- 
tion: Adding new facts does not invalidate previously derived conclusions, 
so the set of conclusions may only increase monotonically. Thus it
establishes a very solid foundation for exact sciences like mathematics. From
daily experience, however, we know that monotonicity is not an appropriate
guideline for human reasoning: New information often makes us revise our
beliefs, i.e. some beliefs which turn out to be false are given up, and other
beliefs supported by the new information are accepted. Because most of our
knowledge is uncertain or incomplete, this process is so typical and so
successful in our everyday lives that the once harsh debates attacking
nonmonotonicity as detrimental to logic (cf. [Isr87]) appear quite whimsical
nowadays. Quite the contrary: Dubois and Prade [DP96] regard nonmonotonicity
as one of the most salient features an exception-tolerant inference
system has to possess.

Nevertheless, nonmonotonic reasoning is still a challenge - which beliefs 
are to be given up, which to be established? In their early paper [MD80], 
McDermott and Doyle tried to specify the scope of their “Nonmonotonic 
logic” between arbitrariness and rigidity: 

The purpose of non-monotonic inference rules is not to add certain 
knowledge where there is none, but rather to guide the selection of 
tentatively held beliefs in the hope that fruitful investigations and 
good guesses will result. This means that one should not a priori 
expect non-monotonic rules to derive valid conclusions independent 
of the monotonic rules. Rather one should expect to be led to a set of 
beliefs which while perhaps eventually shown incorrect will meanwhile 
coherently guide investigations. 

Belief revision, on the other hand, deals with the dynamics of belief — 
how should currently held beliefs be modified in the light of new
information? Results in this area are mainly influenced by the so-called AGM
theory, named after Alchourrón, Gärdenfors and Makinson, who set up a framework
of postulates for a reasonable change of beliefs ([AGM85, Gar88]).

This book exploits the crucial relationship between plausible uncertain 
reasoning and conditionals to obtain a unified and enhanced framework for 
studying nonmonotonic reasoning and belief revision: We will investigate how 
to revise epistemic states by (sets of) conditionals in different settings, inclu- 
ding a purely qualitative, a probabilistic, and an intermediate environment 
using ordinal conditional functions for representation. Epistemic states
represent the cognitive state of an intelligent agent at a given time. They permit us
to graduate beliefs according to their plausibility and thus allow a more ap- 
propriate studying of belief change than plain propositional belief sets which 
are the objects of interest in AGM theory. While AGM theory only observes 
the results of revisions, considering epistemic states under change focuses on 
the mechanisms underlying that change, taking conditional beliefs as revision 
policies explicitly into account. So the work presented here meets a crucial 
demand raised in Friedman and Halpern’s critique [FH99] of AGM revision:
“. . . whatever we take to be our representation of the epistemic state, it seems
appropriate to consider how these representations should be revised.”

The idea to consider conditionals, nonmonotonic reasoning and belief re- 
vision from a common point of view is not new. Indeed, the crucial role of 
“conditional objects” has been recognized for many years (see [DP91a, KS91, 
Bou94, FH94]). In this book, however, conditionals are not considered as logi- 
cal entities, but as dynamic agents shifting worlds in order to establish beliefs. 
This understanding of conditionals has far-reaching consequences and yields a 
theory which is quite different from the ones raised by logical considerations. 

In separating structural from numerical aspects when handling conditio- 
nals, the basic notions for conditionals developed here may be applied to yield 
important results both in a qualitative and in a numerical setting. Indeed, 
conditionals are at the heart of this book from which the other major topics 
- revising epistemic states, extended nonmonotonic reasoning and knowledge 
discovery - will be developed. Conditional valuation functions will be intro- 
duced as abstract (numerical) representations of epistemic states covering 
probability functions, ordinal conditional functions and possibility distributi- 
ons. The notion of a conditional structure defined for (multi-)sets of possible 
worlds allows us to formalize correctly the idea of indifference of conditional 
valuation functions with respect to sets of conditionals, resulting in the state- 
ment of a principle of conditional preservation for revisions. Within a purely 
qualitative environment, we set up postulates describing what it means for a 
revision to preserve conditional beliefs. Moreover, we show that the (numeri- 
cal) principle of conditional preservation actually generalizes these postulates. 
Thus a thorough axiomatization of this principle is obtained which
constitutes an important paradigm when revising epistemic states, similar to the
paradigm of minimal propositional change guiding AGM-revisions.

Besides conditionals, a second focus of this book is on probabilistic
reasoning at optimum entropy where the principles of maximum entropy and minimum
cross-entropy (ME-principles), respectively, will be used to represent incomplete
probabilistic knowledge in an information-theoretically sound way. It can 
easily be seen that each ME-revision satisfies the principle of conditional pre- 
servation and therefore fits the formal framework sketched above. We will 
investigate what other postulates are necessary to characterize ME-revisions
as a “best” probabilistic revision function, identifying an appropriate fun- 
ctional concept and the properties of logical coherence and representation 
invariance as responsible for the special form of ME-revisions. This establis- 
hes reasoning at optimum entropy as a most fundamental inference method 
when using probabilistic conditionals to represent knowledge. 

Integrating ME-inference and ME-revision, respectively, into the frame- 
works of nonmonotonic reasoning and belief revision turns out to be particu- 
larly fascinating and fruitful: On the one hand, the abstract properties of non- 
monotonic inference operations like cumulativity, left logical equivalence etc. 
([Gab85, KLM90, Mak94]) and the axioms for belief revision ([Gar88, DP94]) 
help to classify the ME-techniques from a formal point of view, thereby raising 
the reputation of this powerful, but sometimes seemingly obscure method. On 
the other hand, however, studying ME-inference may give important impetus 
to the field of nonmonotonic logics and belief revision in general. For a long 
time, both areas have been concentrating on handling only propositional be- 
liefs in a one-step manner, without basing inferences explicitly on a theory 
and without considering iterated revisions. In contrast to this, the principles 
of optimum entropy provide a comprehensive frame to realize iterated revi- 
sions of (probabilistic) epistemic states by sets of conditionals, thus genera- 
lizing classical AGM-revision in nearly all aspects. And indeed, the property 
of logical coherence, mentioned already in [SJ81] and here used as one of 
the postulates to characterize ME-revision, may be read as a set-theoretical 
version of Darwiche and Pearl’s axiom (C1) for iterated revision ([DP97a]).

The axiom of logical coherence may be formulated easily for general
epistemic states, and we interpret it as a kind of “cumulativity with respect to
epistemic states” phrased for universal inference operations. Universal
inference operations provide a global setting to study inference and revision, and
logical coherence proves to be an important means to link up inferences based
on different epistemic states. Considering revision operators in the enhanced
framework of revising epistemic states by sets of conditionals allows us to 
differentiate between simultaneous and successive revision, and to separate 
background from evidential knowledge. This permits us to distinguish more 
clearly between different belief change operations like (genuine) revision, up- 
dating and focusing which may be realized, however, by the same (binary) 
revision operator. 

Furthermore, the principle of conditional preservation presented here was 
first developed for ME-revision in [KI98a], but it can also be applied in a 
more general framework. The conditionals to be incorporated impose a spe- 
cific structure on the resulting probability distribution. This structure may 
be used to elaborate sets of probabilistic conditionals that represent a given 
distribution (inverse representation problem). This is particularly important
when one wants to extract probabilistic knowledge from statistics in order
to present relevant connections between the considered variables (knowledge
discovery / data mining) and/or to design a probabilistic network for
knowledge representation and reasoning. The first steps in this direction will be
undertaken in this book. 

The principal concern of this book is to develop a common approach to 
symbolic and numerical uncertain reasoning via conditionals. Most of the to- 
pics to be treated here, however, were actually addressed when working on 
a concrete computational system, namely the expert system shell SPIRIT 
realizing maximum entropy propagation ([RKI97b, RKI97a, RM96]). Here 
the following questions arose: Besides respecting (conditional) independence, 
what are the mechanisms underlying ME-techniques? How can ME-inference 
and ME-adaptation be compared to other methods? Conditionals are ge- 
nerally considered to be very important for knowledge representation and 
reasoning, but how can their meaning and effects be made explicit? And, last
but not least, a crucial problem in designing expert systems: Where do all the
conditionals representing substantial knowledge come from? How should we 
use experimental data? 

This book aims at answering all these questions by presenting a general 
framework for nonmonotonic reasoning and belief revision that features con- 
ditionals and ME-methods as particularly meaningful both to qualitative and 
quantitative approaches. 



1.2 Overview 

The organization of this book is as follows: Fixing basic definitions and
notations in the next section will conclude this introduction.

Chapter 2 outlines the state of the art of belief revision and nonmonoto- 
nic reasoning. Several properties of nonmonotonic inference operations which 
will be used in this book are listed here. In the area of belief revision, the 
standard AGM-theory dealing with expansion, revision and contraction of 
propositional beliefs is recalled, and we explain the difference between revi- 
sion and updating in the sense of Katsuno and Mendelzon. Then we discuss 
how to extend this framework of propositional belief change by studying 
epistemic states and conditionals, allowing us to perform iterative revisions. 
Finally, we present a picture of belief revision and nonmonotonic reasoning 
from a probabilistic point of view, featuring revisions and inferences based on 
the principles of optimum entropy as particularly sophisticated and powerful 
methods. 
Chapter 3 focuses on conditionals and starts with studying the
connections between conditionals and epistemic states. Conditional valuation
functions are introduced, and we discuss how the acceptance of (conditional)
beliefs in epistemic states may be modelled by these functions. Then we turn 
to more formal things: In Section 3.4, we define two relations,
subconditionality, $\sqsubseteq$, and perpendicularity, $\perp\!\!\!\perp$, on conditionals, describing quite extreme
ways of conditionals interacting with one another. These relations will prove 
to be useful especially in a qualitative setting of belief change and in the 
field of (conditional) knowledge discovery. An even more important notion is 
introduced in Section 3.5: We represent conditionals by generators of a (free
abelian) group and define the conditional structure of a world. Using group
theoretical structures makes it possible to calculate with conditionals, or with 
their effects on worlds, respectively. These formal means provide a framework 
adequate to phrase exactly, what conditional indifference of a conditional va- 
luation function with respect to a set of conditionals means (see Section 3.6). 
We show that probability functions and ordinal conditional functions which 
are indifferent with respect to some set of conditionals follow quite a simple 
conditional-logical pattern. 

Conditional indifference will prove to be of crucial importance when re- 
vising epistemic states by conditional beliefs in Chapter 4 in that it is the 
essential ingredient to formalizing a quantitative principle of conditional pre- 
servation for revising conditional valuation functions in Section 4.5. Revisions 
by sets of conditionals and representations of sets of conditionals will be cal- 
led c-revisions and c-representations, respectively. But first, we will describe 
what it means to preserve conditional beliefs in a purely qualitative setting 
by stating postulates for revising an epistemic state by a conditional in Sec- 
tion 4.1. Representation theorems for these postulates will be given in Section 
4.2, and they will be exemplified by presenting a revision operator for ordinal 
conditional functions in Section 4.4. We investigate the meaning of conditio- 
nal valuation functions for qualitative revisions in Section 4.3, and in Section 
4.5, we show that both approaches to the principle of conditional preservation 
developed here, the qualitative one and the quantitative one, are compatible. 

The idea of revisions obeying the principle of conditional preservation 
is pursued further in a probabilistic environment in Chapter 5. We elabo- 
rate three more postulates such a revision should satisfy: First, a functional 
concept should establish a clear and unique connection between prior know- 
ledge, new information and the revised probability distribution (see Section 
5.2). Second, we present the postulate for logical coherence in Section 5.3. 
This postulate claims that revised probability distributions can be used
unambiguously as priors for further revisions and thus is of crucial importance
particularly for iterated revisions. Finally, the postulate for representation 
invariance states that revisions should not depend on the syntactical
representation of probabilistic knowledge (cf. Section 5.4). Following these four
postulates, we arrive at a new characterization of revisions based on the prin- 
ciple of minimum cross-entropy (ME-revisions) (cf. Theorem 5.5.1 in Section 
5.5). 

We investigate how ME-reasoning works in Chapter 6. First we check 
which of the properties relevant to nonmonotonic inference operations are
satisfied by ME-inference. In particular, we show that ME-inference is cumu- 
lative and fulfills the loop-property (cf. Section 6.2). Then we present some 
ME-deduction schemes in Section 6.4 to illustrate ME-reasoning in simple, 
but typical and informative situations such as transitive chaining, cautious 
monotonicity and reasoning by cases. 

In Chapter 7, we return to general belief revision and nonmonotonic reaso- 
ning. Due to the formal manner in which numerical ME-inference is handled 
in this book, it is possible to transfer some crucial insights provided by this 
powerful inference operation to the general theory. ME-reasoning exemplifies 
effectively how a more comprehensive and unified view on this area is opened 
by considering revisions and inferences in an extended framework using epi- 
stemic states and conditionals. Universal inference operations are introduced 
in Section 7.1 as a proper counterpart to revision operators in nonmonotonic 
reasoning, allowing us to take a basic epistemic state into account. We show 
how to distinguish between simultaneous and successive revision (cf. Section 
7.2), and how to separate clearly between background and evidential know- 
ledge (cf. Section 7.3). This allows us in particular to overcome the conceptual 
difference between (genuine) revision (in the AGM-sense) and updating (in 
the sense of Katsuno and Mendelzon) by considering them not as different 
change operators, but as applying the same change operator in different ways 
(cf. Sections 7.4 and 7.5). Moreover, focusing may be realized as different from 
revision by the ME-revision operator (cf. Section 7.6). Iterated revisions may 
be dealt with adequately in that framework, too. The postulate for logical 
coherence used for ME-characterization proves to be of crucial importance to 
control iterated revision and to link up inference operations. 

Chapter 8 brings a brief sketch of some results in probabilistic knowledge 
discovery and then turns to its main part, the discovery of structures of know- 
ledge by following conditional patterns within conditional valuation functions. 
Revisions of such functions which obey the principle of conditional preserva- 
tion are necessarily indifferent with respect to the revising set of conditionals. 
So discovering “conditional structures” means in particular finding a set of 
conditionals with respect to which the given conditional valuation function, 
e.g. a probability distribution, is indifferent. In Section 8.2, we develop an 
approach to accomplish this task by using the group theoretical representati- 
ons of conditionals, developed in Chapter 3. Part of an algorithm is presented 
that allows us to calculate such a set of conditionals by studying numerical 
relationships between the values given. This method is applied in Section 8.3
to illustrate how an ME-optimal representation of a probability distribution 
by a set of conditionals can be computed. 

Chapter 9 presents briefly a selection of various computational approaches 
to ME-reasoning with probabilistic conditionals, to probabilistic knowledge di- 
scovery and to possibilistic belief revision. 

Finally, Chapter 10 summarizes the results of this book. 

Preliminary versions of various parts of this book have already been pu- 
blished in [RKI93, KIR96, KI96b, KI96a, KI97a, KI97c, KI97b, KI98c, KI98a, 
RKI97b, RKI97a, KI01, KI98b, KI99c, KI99b], and in [KI99a].

In order to improve the readability of the text, the full proofs of lemmata, 
propositions, corollaries and theorems have been moved to Appendix A. 



1.3 Basic Definitions and Notations 

1.3.1 Propositional and Conditional Expressions 

We consider a propositional language $\mathcal{L} = \mathcal{L}(\mathcal{V})$ over a finite alphabet
$\mathcal{V} = \{a, b, c, \ldots\}$. Uppercase roman letters $A, B, C, \ldots$ will denote atoms or
formulas in $\mathcal{L}$. $\mathcal{L}$ is equipped with the usual logical connectives $\wedge$ (and), $\vee$
(or) and $\neg$ (negation). We will largely avoid material implication in order not
to get confused with conditional implication (see below). To simplify
notations, we will replace a conjunction by juxtaposition and indicate the negation
of a proposition by barring it, i.e.

$AB = A \wedge B$ and $\overline{A} = \neg A$

$\dot{A}$ will denote one of the formulas $A$, $\overline{A}$. Elementary conjunctions are
conjunctions of literals, i.e. of positive or negated atoms. Complete conjunctions
are elementary conjunctions which contain each atom either in positive or
negated form. Tautologies and contradictions will be denoted by $\top$ and $\bot$,
respectively.

Let $\Omega$ denote the set of possible worlds, i.e. $\Omega$ is a complete set of
interpretations of $\mathcal{L}$. Two worlds $\omega, \omega' \in \Omega$ are called neighbors if they differ with
respect to exactly one atom.

Given a propositional formula $A \in \mathcal{L}$, we denote by $\mathrm{Mod}(A)$ the set of all
$A$-worlds,

$\mathrm{Mod}(A) = \{\omega \in \Omega \mid \omega \models A\}$
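
To make the notions of possible worlds, neighbors and the model set of a formula concrete, the following is a minimal Python sketch. It is an illustration only, not part of the formal development: the three-atom alphabet, the representation of worlds as dictionaries, and the function names worlds, mod and neighbors are all assumptions made for this example.

```python
from itertools import product

ATOMS = ("a", "b", "c")  # hypothetical alphabet, chosen for illustration


def worlds(atoms=ATOMS):
    """Enumerate all possible worlds as dicts mapping each atom to a truth value."""
    return [dict(zip(atoms, bits))
            for bits in product((True, False), repeat=len(atoms))]


def mod(formula, atoms=ATOMS):
    """Model set of a formula, where the formula is given as a predicate on worlds."""
    return [w for w in worlds(atoms) if formula(w)]


def neighbors(w1, w2):
    """Two worlds are neighbors if they differ with respect to exactly one atom."""
    return sum(w1[v] != w2[v] for v in w1) == 1


# The formula "a and not b", written as a predicate on worlds (an assumed encoding).
a_and_not_b = lambda w: w["a"] and not w["b"]

models = mod(a_and_not_b)
print(len(worlds()), len(models))        # number of worlds, number of models
print(neighbors(models[0], models[1]))   # the two models differ only in atom c
```

Representing formulas as predicates sidesteps parsing; a syntactic representation would serve equally well for this purpose.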
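Definition 1.3.1. For a set of worlds $\{\omega_1, \omega_2, \ldots\} \subseteq \Omega$, we define

$\mathrm{form}(\omega_1, \omega_2, \ldots) \in \mathcal{L}$

to be that proposition in $\mathcal{L}$ which has $\omega_1, \omega_2, \ldots$ as its models:

$\mathrm{Mod}(\mathrm{form}(\omega_1, \omega_2, \ldots)) = \{\omega_1, \omega_2, \ldots\}$

Sometimes, worlds will be identified simply with their corresponding
complete conjunction

$\omega = \bigwedge_{v \colon \omega \models \dot{v}} \dot{v}$    (1.1)

where the conjunction is taken over all atoms $v$ in $\mathcal{L}$, each entering as the
literal $\dot{v}$ (either $v$ or $\overline{v}$) that holds in $\omega$.

If $A, B \in \mathcal{L}$ are two propositional formulas, then $A \leq B$ iff $A \models B$, i.e.
iff $\mathrm{Mod}(A) \subseteq \mathrm{Mod}(B)$. $\equiv$ means classical logical equivalence, that is $A \equiv B$
iff $\mathrm{Mod}(A) = \mathrm{Mod}(B)$.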
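
Definition 1.3.1 and the model-based reading of entailment can be sketched in a few lines of Python. This is a hedged illustration under assumed conventions: worlds are dictionaries over a two-atom alphabet, formulas are predicates, and the names form, form_text and entails are hypothetical helpers introduced here.

```python
from itertools import product

ATOMS = ("a", "b")  # hypothetical alphabet


def worlds():
    """All truth assignments to the atoms."""
    return [dict(zip(ATOMS, bits))
            for bits in product((True, False), repeat=len(ATOMS))]


def mod(formula):
    """Model set of a formula given as a predicate on worlds."""
    return [w for w in worlds() if formula(w)]


def complete_conjunction(w):
    """Write a world as its complete conjunction, with ~ marking negated atoms."""
    return " & ".join(v if w[v] else "~" + v for v in ATOMS)


def form(ws):
    """A formula (as a predicate) whose models are exactly the given worlds."""
    return lambda w: w in ws


def form_text(ws):
    """The same formula written as a disjunction of complete conjunctions."""
    return " | ".join("(" + complete_conjunction(w) + ")" for w in ws)


def entails(f, g):
    """f entails g iff every model of f is also a model of g."""
    return all(g(w) for w in mod(f))


ws = [{"a": True, "b": True}, {"a": False, "b": True}]
print(form_text(ws))
print(mod(form(ws)) == ws)                    # the models of form(ws) are ws again
print(entails(form(ws), lambda w: w["b"]))    # both worlds satisfy b
```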

$\mathcal{L}$ is extended to a conditional language $(\mathcal{L} \mid \mathcal{L})$ by introducing a
conditional operator $\mid$:

$(\mathcal{L} \mid \mathcal{L}) = \{(B|A) \mid A, B \in \mathcal{L}\}$

$A$ is called the antecedent or the premise of $(B|A)$, and $B$ is the
consequence of the conditional $(B|A)$. $(\mathcal{L} \mid \mathcal{L})$ is taken to include $\mathcal{L}$ by identifying
a proposition $A$ with the conditional $(A|\top)$.
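
The non-propositional character of conditionals described in the preface (a conditional is either not applicable, confirmed, or violated in a given situation) can be illustrated by a small Python sketch. The encoding of a conditional as a pair of predicates, the atom names step and wet, and the functions evaluate and as_conditional are assumptions made for this example only.

```python
def evaluate(conditional, world):
    """Classify a world with respect to a conditional (B|A), given as the pair (B, A)."""
    consequent, antecedent = conditional
    if not antecedent(world):
        return "not applicable"   # antecedent false: the conditional stays inactive
    return "confirmed" if consequent(world) else "violated"


def as_conditional(formula):
    """Identify a proposition A with the conditional whose antecedent is a tautology."""
    return (formula, lambda w: True)


# Hypothetical atoms for the puddle example: 'step' and 'wet'.
puddle = (lambda w: w["wet"], lambda w: w["step"])   # the conditional (wet | step)

print(evaluate(puddle, {"step": True, "wet": True}))     # stepped in, feet wet
print(evaluate(puddle, {"step": True, "wet": False}))    # stepped in, rain boots
print(evaluate(puddle, {"step": False, "wet": False}))   # walked around
```

A plain proposition, seen as a conditional with tautological antecedent, is always applicable, so it can only be confirmed or violated; this matches the two-valued behavior of classical formulas.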



1.3.2 Probabilistic Logics 

The atoms in $\mathcal{L}$ may be looked upon as (binary) propositional variables, and
possible worlds or complete conjunctions, respectively, correspond to
elementary events. So, given a probability distribution $P$ over $\mathcal{V}$, a probability can
be assigned to each propositional formula $A \in \mathcal{L}(\mathcal{V})$ via

$P(A) = \sum_{\omega \models A} P(\omega)$

In this way, a probabilistic interpretation of $\mathcal{L}$ is obtained.
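
As a small worked instance of this summation over worlds, consider the following Python sketch. The two-atom alphabet and the numbers in the distribution are made up for illustration, and the function name prob is an assumption; exact fractions are used so that no rounding interferes.

```python
from fractions import Fraction

ATOMS = ("a", "b")  # hypothetical alphabet

# A made-up distribution over the four worlds, keyed by the truth values of (a, b).
P = {
    (True, True): Fraction(4, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(3, 10),
}


def prob(formula, dist=P):
    """Probability of a formula: the sum of P(w) over all worlds w satisfying it."""
    return sum((p for w, p in dist.items() if formula(dict(zip(ATOMS, w)))),
               Fraction(0))


print(prob(lambda w: w["a"]))             # probability of a
print(prob(lambda w: w["a"] or w["b"]))   # probability of a or b
```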

We extend $\mathcal{L}(\mathcal{V})$ to a probabilistic conditional language $(\mathcal{L} \mid \mathcal{L})^{prob}$ by
attaching a probability $x \in [0, 1]$ to each conditional $(B|A) \in (\mathcal{L} \mid \mathcal{L})$:

$(\mathcal{L} \mid \mathcal{L})^{prob} = \{(B|A)[x] \mid (B|A) \in (\mathcal{L} \mid \mathcal{L}),\ x \in [0, 1]\}$

$(B|A)[x]$ is called a probabilistic conditional or sometimes a probabilistic
rule, too. It is to represent syntactically non-classical conditional assertions
$(B|A)$ weighted with a degree of certainty $x$. Probabilistic conditionals are
interpreted via conditional probabilities: If $P$ is a distribution, we write

$P \models (B|A)[x]$ iff $P(A) > 0$ and $P(B|A) = \frac{P(AB)}{P(A)} = x$

A probabilistic fact $A[x]$ is regarded as equivalent to the probabilistic
conditional $(A|\top)[x]$ with tautological antecedent, so $P \models A[x]$ iff $P(A) = x$.
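
The satisfaction relation between a distribution and a probabilistic conditional can be checked mechanically. The sketch below is illustrative only: the distribution is made up, formulas are encoded as predicates on worlds, and the names prob and satisfies are assumptions introduced for this example.

```python
from fractions import Fraction

ATOMS = ("a", "b")  # hypothetical alphabet

# A made-up distribution over the four worlds, keyed by the truth values of (a, b).
P = {
    (True, True): Fraction(4, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(3, 10),
}


def prob(formula, dist=P):
    """Probability of a formula: the sum of P(w) over all worlds w satisfying it."""
    return sum((p for w, p in dist.items() if formula(dict(zip(ATOMS, w)))),
               Fraction(0))


def satisfies(dist, consequent, antecedent, x):
    """Check whether dist satisfies (B|A)[x]: P(A) > 0 and P(AB) / P(A) = x."""
    p_a = prob(antecedent, dist)
    if p_a == 0:
        return False   # conditionals with impossible antecedent are never satisfied
    p_ab = prob(lambda w: antecedent(w) and consequent(w), dist)
    return p_ab / p_a == x


a = lambda w: w["a"]
b = lambda w: w["b"]
top = lambda w: True   # tautological antecedent, used to encode probabilistic facts

print(satisfies(P, b, a, Fraction(4, 5)))     # the conditional (b|a)[0.8]
print(satisfies(P, a, top, Fraction(1, 2)))   # the fact a[0.5], read as (a|T)[0.5]
```

Exact fractions are used here so that the equality test is reliable; with floating-point probabilities a tolerance would be needed instead.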

In general, we have

$x = P(B|A)$ iff $P(A) > 0$ and $(1 - x) P(AB) = x P(A\overline{B})$,

so the quotient

$\frac{P(AB)}{P(A\overline{B})}$

determines the probability of the conditional $(B|A)$.
It represents the proportion of individuals or objects with property $A$ which
also have property $B$ to those that do not. Thus it is crucial for accepting
the conditional, not only within a probabilistic framework (cf. [Nut80]).
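
The relation between the conditional probability and this quotient can be verified on a small numerical instance. The following Python sketch uses made-up values for the two probabilities and assumes that the probability of the falsifying case is positive, so that the quotient is defined; the function names are hypothetical.

```python
from fractions import Fraction


def conditional_probability(p_ab, p_a_notb):
    """P(B|A) computed from P(AB) and P(A not-B), using P(A) = P(AB) + P(A not-B)."""
    return p_ab / (p_ab + p_a_notb)


def quotient(p_ab, p_a_notb):
    """The quotient of verifying to falsifying probability; requires p_a_notb > 0."""
    return p_ab / p_a_notb


def probability_from_quotient(q):
    """Recover x = P(B|A) from the quotient q, since q = x / (1 - x)."""
    return q / (1 + q)


p_ab = Fraction(4, 10)       # made-up value for P(AB)
p_a_notb = Fraction(1, 10)   # made-up value for P(A not-B)

x = conditional_probability(p_ab, p_a_notb)
q = quotient(p_ab, p_a_notb)

print(x, q)
print((1 - x) * p_ab == x * p_a_notb)        # the identity stated in the text
print(probability_from_quotient(q) == x)     # the quotient determines x
```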



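Probabilistic facts and conditionals will also be denoted by small Greek
letters $\phi$, $\psi$ etc. So for a distribution $P$ and for a set $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ of
probabilistic conditionals, we write $P \models \mathcal{R}$ iff $P \models \phi$ for all $\phi \in \mathcal{R}$. Models
of probabilistic conditionals are probability distributions that fulfill them,
hence

$\mathrm{Mod}(\mathcal{R}) = \{Q \mid Q \text{ distribution over } \mathcal{V},\ Q \models \mathcal{R}\}$

for $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$, $\mathcal{L} = \mathcal{L}(\mathcal{V})$. A set of probabilistic conditionals
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ is consistent iff it has a probabilistic model, i.e. iff there is a
distribution $Q$ such that $Q \models \mathcal{R}$. Two sets $\mathcal{R}_1, \mathcal{R}_2 \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ are
probabilistically equivalent iff $\mathrm{Mod}(\mathcal{R}_1) = \mathrm{Mod}(\mathcal{R}_2)$.
For a distribution $P$ over $\mathcal{V}$, let

$\mathrm{Th}(P) = \{(B|A)[x] \in (\mathcal{L} \mid \mathcal{L})^{prob} \mid P \models (B|A)[x]\}$

denote the set of all probabilistic conditionals which are valid in $P$. $\mathrm{Th}(P)$
explicitly represents the conditional knowledge embodied in $P$.

Two distributions $P_1, P_2$ are identical iff $P_1(\omega) = P_2(\omega)$ for all $\omega \in \Omega$,
that is, $P_1 \models \omega[x]$ iff $P_2 \models \omega[x]$. So there is a one-to-one correspondence
between distributions $P$ and their theories $\mathrm{Th}(P)$.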




2. Belief Revision and Nonmonotonic Reasoning - State of the Art



The capability of revising knowledge and giving up conclusions in the light of 
conflicting evidence is one of the most outstanding features of commonsense 
reasoning. Though it seems to be practised in everyday life in a most natural 
and self-evident way, it challenges knowledge representation and inference 
procedures in AI because it clashes with the classical property of monotoni- 
city. Therefore defeasible or nonmonotonic reasoning, as it is usually called, 
requires new formalisms to be realized adequately, and to date, a number 
of approaches to “nonmonotonic logic” have been proposed. Makinson and 
others ([Gab85, Mak94, KLM90]) set forth formal properties and axiom sy- 
stems to judge and classify inference relations lacking monotonicity. Makin- 
son’s work also covers quite general inference procedures not being based on 
classical structures. 

The topic of belief revision is to investigate knowledge bases in change. 
The great variety of approaches that have been proposed to date, usually 
each method coming along with a descriptive axiom scheme (for a survey, 
cf. [GR94]), corresponds to the many different interpretations and names 
the term change has been given. Gärdenfors [Gar88] identified three
fundamental types of belief change: revision, expansion and contraction. Katsuno and
Mendelzon [KM91b] recommend updating to handle knowledge in a changing
world. Conditioning has been regarded as an adequate method for revising
probabilistic beliefs (see, for instance, [Par94, Gar88]), but Dubois and Prade
[DP97b] emphasize that actually, conditioning does not correspond to
revision but rather to focusing.

Nonmonotonic reasoning and belief revision are closely related but have 
different focusses: While studying and realizing nonmonotonic inference re- 
lations is the principal topic of the first, the latter is mainly concerned with 
investigating the resulting changes in the belief sets (or belief states). 



G. Kern-Isberner: Conditionals in NMR and Belief Revision, LNCS 2087, pp. 11-26, 2001.
© Springer-Verlag Berlin Heidelberg 2001

2.1 Nonmonotonic Reasoning

A number of different methods to realize nonmonotonic reasoning have been 
proposed up to now, among others Reiter’s default logic [Rei80, Ant97] and
its variants ([Bre94, Ant97]), circumscription [McC80], autoepistemic logic
[MD80, Moo88], inheritance diagrams [Tou86], maxiconsistent sets [Poo88], 
preferential models [KLM90] , and, using quantified uncertainty, probabilistic, 
possibilistic and fuzzy logics (cf. [Pea88, DLP94, KGK93]); for a survey, see 
[Som92] or [GHR94]. In spite of the diversity of representation forms and 
inference techniques used it is possible to compare different nonmonotonic 
logics by focussing on the inference relations induced. This idea goes back to 
Gabbay [Gab85] and was later pursued and elaborated by Kraus, Lehmann, 
Magidor [KLM90] and Makinson [Mak89, Mak94]. A number of important 
properties to describe reasonable nonmonotonic inference relations have been 
identified, e.g. cumulativity, loop and rational monotonicity. In the sequel, we 
will give an overview so as to cover the scope of this book. 

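Let $\mathcal{L}^*$ be any language used to represent relevant knowledge appropriately,
e.g. in this book, $\mathcal{L}^*$ may be one of $\mathcal{L}$, $(\mathcal{L} \mid \mathcal{L})$ or
$(\mathcal{L} \mid \mathcal{L})^{prob}$. If $\mathrel{|\!\sim}$ is a (nonmonotonic) inference relation
between sets of formulas and single formulas of $\mathcal{L}^*$,

$\mathrel{|\!\sim} \; \subseteq 2^{\mathcal{L}^*} \times \mathcal{L}^*$,

then the corresponding inference operation $C$ is defined via

$C(A) = \{\phi \in \mathcal{L}^* \mid A \mathrel{|\!\sim} \phi\}$

for sets of formulas $A \subseteq \mathcal{L}^*$. Conversely, each inference operation
$C : 2^{\mathcal{L}^*} \to 2^{\mathcal{L}^*}$ induces an inference relation $\mathrel{|\!\sim}$ by setting
$A \mathrel{|\!\sim} \phi$ iff $\phi \in C(A)$. We
will use both notations simultaneously, depending on which appears more
intuitive. In generalizing the inference relation from single formulas on its
right side to sets of formulas, we will write $A \mathrel{|\!\sim} B$ for $B \subseteq C(A)$.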

Let $A, B$ be sets of formulas of $\mathcal{L}^*$, $\phi$ be a single formula. A (nonmonotonic)
inference operation $C$ is called reflexive if $A \subseteq C(A)$, and idempotent
if $C(C(A)) = C(A)$. $C$ is said to satisfy cut if $A \subseteq B \subseteq C(A)$ implies
$C(B) \subseteq C(A)$, and fulfills cautious monotonicity if $A \subseteq B \subseteq C(A)$ implies
$C(A) \subseteq C(B)$. Cut and cautious monotonicity together yield the property of
cumulativity

$A \subseteq B \subseteq C(A)$ implies $C(A) = C(B)$   (2.1)

guaranteeing a convenient stability of inference: Taking already inferred knowledge
into account does not change inferences. A cumulative inference operation
is assumed to satisfy reflexivity (synonym: inclusion) besides cumulativity.
So a cumulative inference operation also fulfills the condition of reciprocity:
If $A \subseteq C(B)$ and $B \subseteq C(A)$ then $C(A) = C(B)$ (cf. [Mak94]).

Compared to a classical deduction operation, a cumulative inference operation
only differs with respect to monotonicity: Instead of full monotonicity,
i.e. $A \subseteq B$ implies $C(A) \subseteq C(B)$, we have cautious monotonicity. A classical
inference operation satisfying inclusion, monotonicity and cut is called a
consequence operation and is denoted by $Cn$.

If $\mathcal{L}^*$ is a propositional language, $\mathcal{L}^* = \mathcal{L}$, then a (nonmonotonic) inference
operation $C$ can be linked to a classical deduction operation $Cn$ via the
property of supraclassicality, i.e. $Cn(A) \subseteq C(A)$ for all $A \subseteq \mathcal{L}$: Any formula
that can be deduced from $A$ classically can also be derived nonmonotonically.
Other conditions connecting nonmonotonic inference operations to classical
logic are the following:

— left absorption: $Cn\,C = C$;

— full absorption: $Cn\,C = C = C\,Cn$;

— right weakening: If $\phi \in C(A)$ and $\psi \in Cn(\phi)$ then $\psi \in C(A)$;

— left logical equivalence: $Cn(A) = Cn(B)$ implies $C(A) = C(B)$;

— distribution: $C(A) \cap C(B) \subseteq C(Cn(A) \cap Cn(B))$;

— conditionalization: If $\psi \in C(A \cup \{\phi\})$ then $\phi \Rightarrow \psi \in C(A)$ (where $\Rightarrow$ means
material implication).

Two more properties are of relevance when studying the behavior of nonmo- 
notonic inference relations: 

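An inference relation $\mathrel{|\!\sim}$ (or operation $C$, respectively) is said to satisfy
loop if

whenever $A_1 \mathrel{|\!\sim} A_2 \mathrel{|\!\sim} \ldots \mathrel{|\!\sim} A_n \mathrel{|\!\sim} A_1$ then $C(A_i) = C(A_j)$ for $i, j \le n$   (2.2)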

The loop-property is a weakened form of transitivity which typically does not 
hold for nonmonotonic inferences. 

Finally, we recall the non-Horn condition of rational monotonicity (cf. 
[LM92]): 

If $A \mathrel{|\!\sim} \psi$ and not $A \mathrel{|\!\sim} \neg\phi$ then $A \cup \{\phi\} \mathrel{|\!\sim} \psi$   (2.3)

Rational monotonicity is quite a strong condition for nonmonotonic logics:
it treats any formula whose negation cannot be inferred from $A$ as
irrelevant to the inference $A \mathrel{|\!\sim} \psi$.
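
The properties above can be tried out on a small example. The following Python sketch is an illustration only: it assumes a semantic, ranked-model reading of $\mathrel{|\!\sim}$ (a premise entails a conclusion iff all most plausible worlds satisfying the premise satisfy the conclusion), and the atoms, the function `rank` and the helper `entails` are hypothetical names chosen for this example rather than notation from the text.

```python
from itertools import product

ATOMS = ("bird", "penguin", "flies", "red")
# A world is one truth assignment to all atoms.
WORLDS = [dict(zip(ATOMS, bits)) for bits in product([False, True], repeat=len(ATOMS))]

def rank(w):
    """Hypothetical degree of abnormality of a world (0 = most plausible)."""
    if w["penguin"] and (w["flies"] or not w["bird"]):
        return 2  # flying penguins and non-bird penguins are most abnormal
    if w["bird"] and not w["flies"]:
        return 1  # a bird that does not fly is somewhat abnormal
    return 0

def entails(premise, conclusion):
    """premise |~ conclusion: every minimal-rank premise-world satisfies conclusion."""
    candidates = [w for w in WORLDS if premise(w)]
    if not candidates:
        return True  # an unsatisfiable premise entails everything
    best = min(rank(w) for w in candidates)
    return all(conclusion(w) for w in candidates if rank(w) == best)

bird = lambda w: w["bird"]
flies = lambda w: w["flies"]
not_flies = lambda w: not w["flies"]
not_penguin = lambda w: not w["penguin"]
not_red = lambda w: not w["red"]
bird_and_penguin = lambda w: w["bird"] and w["penguin"]
bird_and_not_penguin = lambda w: w["bird"] and not w["penguin"]
bird_and_red = lambda w: w["bird"] and w["red"]

# Monotonicity fails: adding "penguin" retracts the conclusion "flies".
monotonicity_fails = entails(bird, flies) and not entails(bird_and_penguin, flies)

# Cautious monotonicity (one instance): adding an already inferred formula is harmless.
cautious_ok = (entails(bird, flies) and entails(bird, not_penguin)
               and entails(bird_and_not_penguin, flies))

# Rational monotonicity (one instance of (2.3)): "not red" is not inferred from
# "bird", so adding "red" keeps the conclusion "flies".
rational_ok = (entails(bird, flies) and not entails(bird, not_red)
               and entails(bird_and_red, flies))

print(monotonicity_fails, cautious_ok, rational_ok)
```

The sketch checks single instances only; it does not prove that ranked-model inference satisfies the properties in general.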



2.2 Belief Revision 

In general, belief revision means the process of adapting some set of beliefs, 
or some accepted knowledge, respectively, to new information. Belief revi- 
sion has many facets, depending on the compatibility of old and new beliefs, 
whether belief is added or given up, or if only generic knowledge is applied
to evidential information. Following [Gar88, AGM85] we will sketch the principal
types of change for propositional beliefs - expansion, revision and
contraction - and the corresponding catalogues of postulates, each describing a
reasonable change of beliefs; these basic postulates are known today as the
AGM-theory, named after Alchourrón, Gärdenfors and Makinson.

Let $K$ be a propositional belief set, that is, $K \subseteq \mathcal{L}$ is a set of propositions
which is closed under classical consequence $Cn$. Let $A$ be some proposition
representing the newly acquired information which $K$ is to be revised by.

The simplest type of revision occurs if $A$ is consistent with $K$, i.e. if $A$
does not contradict any of the beliefs in $K$. This type of revision is called
expansion, denoted by $+$. Gärdenfors [Gar88] lists intuitive postulates for
expansion which characterize it uniquely within a classical
logical framework:

AGM-postulates for expansion:

(AGM +1) $K + A$ is a belief set.

(AGM +2) $A \in K + A$.

(AGM +3) $K \subseteq K + A$.

(AGM +4) If $A \in K$ then $K + A = K$.

(AGM +5) If $K \subseteq H$ then $K + A \subseteq H + A$.

(AGM +6) $K + A$ is the smallest belief set satisfying (AGM +1) - (AGM +5).

Theorem 2.2.1 ([Gar88]). The expansion function $+$ satisfies (AGM +1)
- (AGM +6) iff $K + A = Cn(K \cup \{A\})$.
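
Theorem 2.2.1 has a simple semantic reading: since $K + A = Cn(K \cup \{A\})$, the models of $K + A$ are exactly the models of $K$ that also satisfy $A$. The following Python sketch illustrates this under the assumption of a finite propositional language over two hypothetical atoms `p` and `q`; a belief set is represented by its set of models, and all function names are chosen for this example only.

```python
from itertools import product

ATOMS = ("p", "q")
WORLDS = frozenset(product([False, True], repeat=len(ATOMS)))  # worlds as (p, q) pairs

def models(formula):
    """Set of worlds satisfying a formula given as a Python predicate."""
    return frozenset(w for w in WORLDS if formula(*w))

def expand(mod_k, mod_a):
    """Mod(K + A) = Mod(K) ∩ Mod(A), the semantic side of K + A = Cn(K ∪ {A})."""
    return mod_k & mod_a

def believes(mod_k, mod_a):
    """A ∈ K for a deductively closed K: every model of K satisfies A."""
    return mod_k <= mod_a

K = models(lambda p, q: p or q)    # the agent believes p ∨ q
A = models(lambda p, q: not p)     # new information ¬p, consistent with K
K_plus_A = expand(K, A)

# After expansion the agent believes A (AGM +2), keeps all old beliefs
# (AGM +3, i.e. fewer models), and has additionally come to believe q.
print(believes(K_plus_A, A), K_plus_A <= K, believes(K_plus_A, models(lambda p, q: q)))
```

Note that a larger belief set corresponds to a smaller set of models, which is why (AGM +3) appears as `K_plus_A <= K`.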

Belief revision in general (operator: $*$) does not presuppose the consistency
of the new information $A$ with the belief set $K$; if consistency holds, however,
revision reduces to expansion:

AGM-postulates for revision:

(AGM *1) $K * A$ is a belief set.

(AGM *2) $A \in K * A$.

(AGM *3) $K * A \subseteq K + A$.

(AGM *4) If $\neg A \notin K$ then $K + A \subseteq K * A$.

(AGM *5) $K * A$ is inconsistent iff $A$ is contradictory.

(AGM *6) If $A$ and $B$ are logically equivalent, then $K * A = K * B$.

(AGM *7) $K * (A \wedge B) \subseteq (K * A) + B$.

(AGM *8) If $\neg B \notin K * A$ then $(K * A) + B \subseteq K * (A \wedge B)$.

Unlike for expansion, these postulates are not sufficient to describe uniquely
one optimal revision operator; they only outline the scope of reasonable
belief revision. Katsuno and Mendelzon [KM91a] rephrased the AGM-revision
axioms more concisely for propositional logic (where $K$ is supposed to be a
single formula, too):

(AGM’ *1) $K * A$ implies $A$.

(AGM’ *2) If $K \wedge A$ is satisfiable, then $K * A \equiv K \wedge A$.

(AGM’ *3) If $A$ is satisfiable, then $K * A$ is also satisfiable.

(AGM’ *4) If $K_1 \equiv K_2$ and $A_1 \equiv A_2$, then $K_1 * A_1 \equiv K_2 * A_2$.

(AGM’ *5) $(K * A) \wedge B$ implies $K * (A \wedge B)$.

(AGM’ *6) If $(K * A) \wedge B$ is satisfiable, then $K * (A \wedge B)$ implies $(K * A) \wedge B$.

The third important belief change operation presented in [Gar88] is contraction
(operator: $-$) dealing with the mere deletion of beliefs.

AGM-postulates for contraction:

(AGM -1) $K - A$ is a belief set.

(AGM -2) $K - A \subseteq K$.

(AGM -3) If $A \notin K$ then $K - A = K$.

(AGM -4) If not $\vdash A$ then $A \notin K - A$.

(AGM -5) If $A \in K$ then $K \subseteq (K - A) + A$.

(AGM -6) If $A$ and $B$ are logically equivalent, then $K - A = K - B$.

(AGM -7) $(K - A) \cap (K - B) \subseteq K - (A \wedge B)$.

(AGM -8) If $A \notin K - (A \wedge B)$ then $K - (A \wedge B) \subseteq K - A$.

Following Levi [Lev77], expansion and contraction are more fundamental
than revision. As he sees it, revising by $A$ means first contracting $\neg A$ and
then adding consistently belief in $A$; this is formalized by the so-called Levi
identity

$K * A = (K - \neg A) + A$   (2.4)

In this way, revision can be expressed by contraction and expansion. Conversely,
a contraction operation in terms of revision is given by the so-called
Harper identity

$K - A = K \cap (K * \neg A)$   (2.5)

(cf. [Har77]).
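
The interplay of the two identities can be illustrated on model sets. The sketch below is a hedged example, not a construction from the text: it assumes Dalal's Hamming-distance operator as one concrete AGM revision (the text does not single out a particular operator), represents belief sets by their models over three hypothetical atoms, and uses function names invented for this illustration. Intersecting two belief sets corresponds to uniting their model sets.

```python
from itertools import product

WORLDS = frozenset(product([False, True], repeat=3))  # worlds over atoms (p, q, r)

def hamming(w1, w2):
    """Number of atoms on which two worlds differ."""
    return sum(a != b for a, b in zip(w1, w2))

def revise(mod_k, mod_a):
    """Dalal-style revision (assumed here): the A-worlds closest to some model of K."""
    if not mod_k or not mod_a:
        return mod_a
    dist = {w: min(hamming(w, v) for v in mod_k) for w in mod_a}
    best = min(dist.values())
    return frozenset(w for w in mod_a if dist[w] == best)

def expand(mod_k, mod_a):
    """Mod(K + A) = Mod(K) ∩ Mod(A)."""
    return mod_k & mod_a

def contract_harper(mod_k, mod_a):
    """Harper identity (2.5): K - A = K ∩ (K * ¬A), i.e. a union of model sets."""
    return mod_k | revise(mod_k, WORLDS - mod_a)

def revise_levi(mod_k, mod_a):
    """Levi identity (2.4): K * A = (K - ¬A) + A."""
    return expand(contract_harper(mod_k, WORLDS - mod_a), mod_a)

K = frozenset(w for w in WORLDS if w[0] and w[1])   # K believes p ∧ q
A = frozenset(w for w in WORLDS if not w[0])        # new information ¬p

# Contracting via Harper and revising via Levi gives back the original revision.
print(revise(K, A) == revise_levi(K, A))
```

In this example, contracting $K$ by $p$ leaves exactly the belief in $q$: the contracted model set consists of all worlds satisfying `q`.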

Another interesting aspect of belief change was considered by Katsuno
and Mendelzon in [KM91b]. They argued that the AGM-type revision was
only adequate to describe a revision of knowledge about a static world, but
not for recording changes in an evolving world. They called this latter type of
change update (operator: $\diamond$), with erasure being its inverse operation. Their
axioms for belief update, stated below, presuppose the knowledge base $K$ to
be representable by a single propositional formula, here also denoted by $K$:

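KM-postulates for updating:

(KM $\diamond$1) $K \diamond A$ implies $A$.

(KM $\diamond$2) If $K$ implies $A$ then $K \diamond A \equiv K$.

(KM $\diamond$3) If $K$ and $A$ are satisfiable, then $K \diamond A$ is also satisfiable.

(KM $\diamond$4) If $K_1 \equiv K_2$ and $A_1 \equiv A_2$ then $K_1 \diamond A_1 \equiv K_2 \diamond A_2$.

(KM $\diamond$5) $(K \diamond A) \wedge B$ implies $K \diamond (A \wedge B)$.

(KM $\diamond$6) If $K \diamond A_1$ implies $A_2$ and $K \diamond A_2$ implies $A_1$ then $K \diamond A_1 \equiv K \diamond A_2$.

(KM $\diamond$7) If $K$ is complete then $(K \diamond A_1) \wedge (K \diamond A_2)$ implies $K \diamond (A_1 \vee A_2)$.

(KM $\diamond$8) $(K_1 \vee K_2) \diamond A \equiv (K_1 \diamond A) \vee (K_2 \diamond A)$.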



For a thorough discussion of these postulates and a comparison to AGM- 
revision, cf. [KM91b]. 

An important representation result characterizing the AGM-revision was
established in [KM91a]. It makes use of faithful assignments, which assign
to each propositional formula $K$ a pre-order $\le_K$ over the set of worlds (or
interpretations, respectively) $\Omega$ such that the following three conditions are
satisfied:

1. If $\omega, \omega' \in Mod(K)$ then $\omega <_K \omega'$ does not hold;

2. if $\omega \in Mod(K)$ and $\omega' \notin Mod(K)$ then $\omega <_K \omega'$;

3. if $K_1 \equiv K_2$ then $\le_{K_1} = \le_{K_2}$;

where $\omega <_K \omega'$ means $\omega \le_K \omega'$ and not $\omega' \le_K \omega$.

Thus faithful assignments identify the models of $K$ with the $\le_K$-minimal
worlds without making any differences between these models.

Theorem 2.2.2 (Representation theorem [KM91a]). The revision operator
$*$ defines an AGM-revision in the sense of the postulates (AGM *1) -
(AGM *8) iff there exists a faithful assignment that maps each propositional
formula $K$ to a total pre-order $\le_K$ such that

$Mod(K * A) = \min_{\le_K}(Mod(A))$   (2.6)

A similar representation result for update was established in [KM91b]. 
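
To make the representation theorem concrete, the following Python sketch builds one particular faithful assignment and checks the Katsuno-Mendelzon postulates exhaustively on a tiny language. It is a sketch under stated assumptions: the total pre-order attached to $K$ is taken to be the Hamming distance to the nearest model of $K$ (a choice made here for illustration, not prescribed by the text), formulas are identified with their model sets over two hypothetical atoms, and all names are illustrative.

```python
from itertools import combinations, product

WORLDS = tuple(product([False, True], repeat=2))  # worlds over atoms (p, q)

def rank(mod_k, w):
    """Assumed faithful assignment: rank of w = Hamming distance to Mod(K)."""
    return min(sum(a != b for a, b in zip(w, v)) for v in mod_k)

def revise(mod_k, mod_a):
    """Mod(K * A) = minimal A-worlds w.r.t. the pre-order of K, cf. (2.6)."""
    if not mod_a:
        return frozenset()
    best = min(rank(mod_k, w) for w in mod_a)
    return frozenset(w for w in mod_a if rank(mod_k, w) == best)

def all_formulas():
    """Every satisfiable formula, identified with its non-empty set of models."""
    for size in range(1, len(WORLDS) + 1):
        for subset in combinations(WORLDS, size):
            yield frozenset(subset)

def check_postulates():
    for K in all_formulas():
        # Faithfulness: exactly the models of K are minimal (rank 0).
        assert all((rank(K, w) == 0) == (w in K) for w in WORLDS)
        for A in all_formulas():
            R = revise(K, A)
            assert R <= A                              # (AGM' *1)
            assert not (K & A) or R == K & A           # (AGM' *2)
            assert R                                   # (AGM' *3)
            for B in all_formulas():
                RB = revise(K, A & B)
                assert R & B <= RB                     # (AGM' *5)
                assert not (R & B) or RB <= R & B      # (AGM' *6)
    return True

print(check_postulates())
```

Postulate (AGM’ *4) holds by construction here, because equivalent formulas have the same model set and are therefore represented by the same object.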

2.3 Nonmonotonic Reasoning and Belief Revision - 
Two Sides of the Same Coin? 

This question was raised and investigated by Gärdenfors and Makinson in
[MG91] and [Gar92]. The similarities between nonmonotonic reasoning and
belief revision are evident - both fields are concerned with uncertain reasoning,
and the relationship between them is prima facie very plausibly established
by

$A \mathrel{|\!\sim} B$ iff $T * A \models B$   (2.7)

where $T$ is some (classical) theory. In particular, each belief revision operation
gives rise to a nonmonotonic inference operation (or relation). Gärdenfors
and Makinson show in [MG91] how postulates from one field translate via
(2.7) into properties of the other. But although the relationship is close,
it is not perfect, due to essentially different focusses: In belief revision, the
theory $T$ (cf. (2.7)) which is to be revised, is of central concern, whereas
in nonmonotonic reasoning, it is not mentioned at all! This flaw may be
partially remedied by considering the relation $\mathrel{|\!\sim}_T$ as describing inferences
based on some fixed background knowledge $T$ so that (2.7) now reads

$A \mathrel{|\!\sim}_T B$ iff $T * A \models B$

This idea of explicitly representing background knowledge, however, is not 
dealt with in nonmonotonic reasoning. So belief revision should be conside- 
red the more general approach, handling the full dynamics of belief more 
thoroughly. 



2.4 Iterated Revision, Epistemic States, and 
Conditionals 

Classical belief revision takes belief sets, i.e. deductively closed sets of
propositional formulas, or classical consequences of one propositional formula,
respectively, as basic representations of knowledge. These sets, however,
are only poor reflections of the complex attitudes an individual may hold.
The limitation to propositional beliefs severely restricts the frame of the
AGM theory, in particular, when iterated revision has to be performed (cf.
[DP97a, Bou93, BG93, Sch91]). Instead of belief sets, epistemic states, $\Psi$,
should be considered as representations of the cognitive state of some
intelligent agent at a given time.¹

¹ An interesting approach to iterated revisions of belief sets was proposed quite
recently by Lehmann, Magidor and Schlechta in [LMS01]. They base their revision
operations upon a formal notion of distance.

Formally, epistemic states may be described in many different ways. A
very simple representation of an epistemic state is obtained by focusing on
the corresponding belief set, that is, on the propositions believed to be true
in that epistemic state, and by means of simple conditional functions (cf.
[Spo88]) recording its epistemological changes. Here the belief set is thought
of as being represented by one propositional formula, the so-called net content
of the epistemic state (cf. [Spo88]). Identifying propositions with subsets of
the set $\Omega$ of possible worlds, a simple conditional function $f$ assigns to all
non-empty subsets $A \subseteq \Omega$ a non-empty subset of $\Omega$ such that

$f(A) \subseteq A$;   (2.8)

if $f(A) \cap B \neq \emptyset$ then $f(A \cap B) = f(A) \cap B$.   (2.9)

$f(A)$ is to be interpreted as the net content of the changed epistemic state
when accepting the new information $A$ to be true. Simple conditional functions
correspond to partitions of the set of worlds in equally plausible sets
of worlds (cf. [Spo88]) and thus to plausibility pre-orderings. The notion of
simple conditional functions was generalized to ordinal conditional functions
in [Spo88] to achieve a more adequate representation of epistemological
attitudes.
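
The correspondence between plausibility layers and simple conditional functions can be checked mechanically on a small set of worlds. The Python sketch below is an illustration under stated assumptions: the four worlds and their layers are hypothetical, and `f` selects the most plausible worlds of a proposition, which is one way to obtain a function satisfying (2.8) and (2.9).

```python
from itertools import combinations

WORLDS = ("w1", "w2", "w3", "w4")
# Hypothetical plausibility layers: lower number = more plausible.
LAYER = {"w1": 0, "w2": 1, "w3": 1, "w4": 2}

def f(A):
    """Net content after accepting A: the most plausible worlds within A."""
    best = min(LAYER[w] for w in A)
    return frozenset(w for w in A if LAYER[w] == best)

def nonempty_subsets(worlds):
    for size in range(1, len(worlds) + 1):
        for subset in combinations(worlds, size):
            yield frozenset(subset)

def is_simple_conditional_function(func):
    """Exhaustively test non-emptiness, (2.8) and (2.9) over all propositions."""
    for A in nonempty_subsets(WORLDS):
        if not func(A) or not func(A) <= A:                  # non-empty value, (2.8)
            return False
        for B in nonempty_subsets(WORLDS):
            if func(A) & B and func(A & B) != func(A) & B:   # (2.9)
                return False
    return True

print(is_simple_conditional_function(f))
```

With these layers, accepting the proposition `{w2, w3, w4}` yields the net content `{w2, w3}`, the best layer that is still compatible with the new information.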

Furthermore, probability distributions are generally considered as a 
particularly sophisticated means for representing epistemic states ([Gar88, 
Spo88]). Another numerical approach to epistemic states is provided by pos- 
sibility distributions ([DP91c, DP94, DLP94]). 

The crucial difference between belief sets and epistemic states is that besides
the set of propositional beliefs, $Bel(\Psi) \subseteq \mathcal{L}$, the individual accepts for
certain, an epistemic state $\Psi$ also contains the revision policies the individual
entertains at that time (cf. [Bou93, BG93]). These revision policies reflect the
(propositional) beliefs, $B$, the individual is inclined to hold if new information,
$A$, becomes obvious. They are adequately represented by conditionals
$(B|A)$, i.e. expressions of the form “If $A$ then $B$”, conjoining two propositional
formulas $A$ and $B$ for a plausible conclusion. So the conditional $(B|A)$ is
accepted in the epistemic state $\Psi$ iff revising $\Psi$ by $A$ yields belief in $B$. This
defines a fundamental relationship between conditionals and the process of
revision, known as the Ramsey test ([BG93, Gar88, Ram50]):

$\Psi \models (B|A)$ iff $Bel(\Psi * A) \models B$   (2.10)

where $*$ is a revision operator, taking an epistemic state $\Psi$ and some new
belief $A$ as inputs and yielding a revised epistemic state $\Psi * A$ as output. So in
the context of revision, a subjunctive meaning of conditionals fits particularly
well, in accordance with the Ramsey test: If $A$ were true, $B$ would be believed,
implicitly referring to a revision of the actual epistemic state by $A$.

Hence epistemic states are intimately related with belief revision as well
as with conditionals, and so are the axioms of each of the three domains;
e.g. the properties (2.8) and (2.9) stated above for simple conditional functions
are essentially equivalent to the AGM-postulate (AGM *2) and to a
conjunction of (AGM *7) and (AGM *8), respectively. This implies also a
close connection between epistemic states and conditionals on one hand and
nonmonotonic reasoning on the other. For instance, simple conditional functions
may be regarded as so-called choice functions which Schlechta bases
his Nonmonotonic Logics [Sch97] upon. Furthermore, it is easy to show that
property (2.9) for simple conditional functions implies the condition

if $B \subseteq A$ then $f(A) \cap B \subseteq f(B)$

which is crucial for characterizing minimal preferential structures (cf. [Sch97,
p. 13]). The relationships between these different areas have been studied in
several papers (see, for instance, [KS91, DP91a, Gra91]).

Extending belief revision to an operation on epistemic states instead of 
belief sets opens up the framework to investigate iterated revision in a fully 
dynamic system of belief change. As a first step towards this aim, Darwi- 
che and Pearl generalized the AGM-revision postulates for revising epistemic 
states by conditional beliefs (cf. [DP97a]): 

AGM-Postulates for revising epistemic states [DP97a]

Suppose $\Psi, \Psi_1, \Psi_2$ to be epistemic states and $A, A_1, A_2, B \in \mathcal{L}$;

(AGM* *1) $A$ is believed in $\Psi * A$: $Bel(\Psi * A) \models A$.

(AGM* *2) If $Bel(\Psi) \wedge A$ is satisfiable, then $Bel(\Psi * A) \equiv Bel(\Psi) \wedge A$.

(AGM* *3) If $A$ is satisfiable, then $Bel(\Psi * A)$ is also satisfiable.

(AGM* *4) If $\Psi_1 = \Psi_2$ and $A_1 \equiv A_2$, then $Bel(\Psi_1 * A_1) \equiv Bel(\Psi_2 * A_2)$.

(AGM* *5) $Bel(\Psi * A) \wedge B$ implies $Bel(\Psi * (A \wedge B))$.

(AGM* *6) If $Bel(\Psi * A) \wedge B$ is satisfiable then $Bel(\Psi * (A \wedge B))$ implies
$Bel(\Psi * A) \wedge B$.



Considered superficially, these postulates are exact reformulations of the
AGM-postulates with “belief sets” replaced by “belief sets of epistemic states”.
So the postulates above ensure that the revision of epistemic states is in
line with the AGM-theory as long as the revision of the corresponding belief
sets is considered. The most important new aspect in contrast to propositional
belief revision is given by postulate (AGM* *4): Only identical epistemic
states are supposed to yield equivalent revised belief sets. This is a clear but
adequate weakening of the corresponding AGM-postulate (AGM’ *4) which
only would require the belief sets of $\Psi_1$ and $\Psi_2$ to be equivalent. This would
amount to reducing the revision of epistemic states to propositional belief
revision, which is inappropriate since two different epistemic states $\Psi_1, \Psi_2$ may
have equivalent belief sets $Bel(\Psi_1) \equiv Bel(\Psi_2)$. Thus an epistemic state is not
described uniquely by its belief set, and revising $\Psi_1$ and $\Psi_2$ with equivalent
belief sets by new information $A$ may result in different revised belief sets
$Bel(\Psi_1 * A) \not\equiv Bel(\Psi_2 * A)$, as the following example illustrates.



Example 2.4.1. Two physicians have to make a diagnosis when confronted
with a patient showing certain symptoms. They both agree that disease $A$ is
by far the most adequate diagnosis, so they both hold belief in $A$. Moreover,
as the physicians know, diseases $B$ and $C$ might also cause the symptoms, but
here the experts disagree: One physician regards $B$ to be a possible diagnosis,
too, but excludes $C$, thus accepting the conditionals $(B|\neg A)$ and $(\neg C|\neg A)$,
whereas the other physician is inclined to take $C$ into consideration, but not
$B$, so holding belief in $(\neg B|\neg A)$ and $(C|\neg A)$.

Suppose now that a specific blood test definitely proves that the patient is
not suffering from disease $A$. So both experts have to change their beliefs: the
first physician now takes $B$ to be the correct diagnosis, the second one takes
$C$ for granted. Though initially the physicians’ opinions may be described by
the same belief set, $\{A\}$, they end up with different belief sets after revision.

It is important to note that Gärdenfors’ famous triviality result [Gar88],
which states the incompatibility of the Ramsey test with some of the AGM-postulates,
does not hold if conditional beliefs are considered essentially different
from propositional beliefs, as is emphasized here and elsewhere (see,
for instance, [DP97a, Lev88]; cf. also [Lew76]). Therefore, observing the
difference between $Bel(\Psi_1) \equiv Bel(\Psi_2)$ and $\Psi_1 = \Psi_2$ makes the Ramsey test
compatible with the AGM-theory for propositional belief revision: Whereas
$Bel(\Psi_1) \equiv Bel(\Psi_2)$ only means that both epistemic states have equivalent
belief sets, $\Psi_1 = \Psi_2$ requires the two epistemic states to be identical, i.e. to
incorporate in particular the same conditional beliefs as well as the same
propositional beliefs.

Darwiche and Pearl [DP97a] proved a representation theorem for their 
postulates above which parallels the corresponding theorem in AGM-theory 
(cf. [KM91a]), using a generalized notion of faithful assignments: 

Definition 2.4.1. A faithful assignment (for epistemic states) is a function
that maps each epistemic state $\Psi$ to a total pre-order $\le_\Psi$ on the worlds $\Omega$
satisfying the following conditions:

(1) $\omega_1, \omega_2 \models Bel(\Psi)$ only if $\omega_1 =_\Psi \omega_2$;

(2) $\omega_1 \models Bel(\Psi)$ and $\omega_2 \not\models Bel(\Psi)$ only if $\omega_1 <_\Psi \omega_2$;

for worlds $\omega_1, \omega_2 \in \Omega$ and epistemic states $\Psi$.

As usual, $\omega_1 <_\Psi \omega_2$ means $\omega_1 \le_\Psi \omega_2$ and $\omega_2 \not\le_\Psi \omega_1$; $\omega_1 =_\Psi \omega_2$ iff $\omega_1 \le_\Psi \omega_2$
and $\omega_2 \le_\Psi \omega_1$.

Theorem 2.4.1 ([DP97a]). A revision operator $*$ satisfies postulates
(AGM* *1) - (AGM* *6) iff there exists a faithful assignment that maps
each epistemic state $\Psi$ to a total pre-order $\le_\Psi$ such that

$Mod(\Psi * A) = \min(A; \Psi) := \min_{\le_\Psi}(Mod(A))$

where $Mod(\Psi) := Mod(Bel(\Psi))$, i.e. the worlds satisfying $Bel(\Psi * A)$ are
precisely those worlds satisfying $A$ that are minimal with respect to $\le_\Psi$.

This theorem shows an important connection between the pre-ordering $\le_\Psi$
associated with an epistemic state $\Psi$ and the process of revising $\Psi$ by
propositional beliefs. $\le_\Psi$ may be thought of as a plausibility (pre-)ordering
(or ranking, respectively) providing a representation of the epistemic state,
that is, as a total pre-order on the set of worlds satisfying conditions (1)-(2)
of Definition 2.4.1 and the so-called smoothness condition

$\min(A; \Psi) \neq \emptyset$ for any satisfiable $A \in \mathcal{L}$   (2.11)

([BG93]), and such that $Mod(\Psi) = \min_{\le_\Psi}(\Omega)$. The smoothness condition
is also called limit assumption, see [Gro88, Lew73]. Such epistemic states
correspond to Boutilier’s revision models, as described in [BG93].
Because we assume the number of possible worlds to be finite, the smoothness
condition is trivially fulfilled. A more general approach to nonmonotonic
reasoning and belief revision makes use of an indexed set of possible worlds,
thus allowing infinitely many possible worlds (cf. [KLM90, Sch97, LMS01]).
In [Bou94], Boutilier considers revision based on pre-orders without requiring
the limit assumption. Other approaches to epistemic states and belief revision
(or nonmonotonic reasoning, respectively) are accomplished by considering
systems of spheres [Gro88], epistemic entrenchment orderings [Gar88] and
expectation orderings [GM94]. Friedman and Halpern [FH99] emphasize the
need to clarify the ontology underlying a belief change process. They present
an approach to model dynamic revisions of epistemic states.

Using the Ramsey test (2.10), Theorem 2.4.1 immediately yields

Lemma 2.4.1. A conditional $(B|A)$ is accepted in an epistemic state $(\Psi, \le_\Psi)$
iff all minimal $A$-worlds satisfy $B$, i.e.

$\Psi \models (B|A)$ iff $\min(A; \Psi) \subseteq Mod(B)$

Thus the pre-order $\le_\Psi$ encodes the conditional beliefs held in $\Psi$. For two
propositional formulas $A, B$, we define

$A \le_\Psi B$

iff for all $\omega \in \min(A; \Psi)$, $\omega' \in \min(B; \Psi)$, we have $\omega \le_\Psi \omega'$, i.e. iff the
minimal $A$-worlds are at least as plausible as the minimal $B$-worlds. $\le_\Psi$ is
a plausibility relation in the sense of Grove (cf. [Gro88, Bou94]), hence dual
to an epistemic entrenchment relation² (see [Gar88]). Using this, the lemma
above may be reformulated as

Lemma 2.4.2. A conditional $(B|A)$ is accepted in an epistemic state $(\Psi, \le_\Psi)$
iff $AB <_\Psi A\overline{B}$.

In particular, we have (see Definition 1.3.1)

Corollary 2.4.1. Let $\omega, \omega' \in \Omega$ be two different worlds, let $(\Psi, \le_\Psi)$ be a
representation of an epistemic state. Then

$\Psi \models (form(\omega) \mid form(\omega, \omega'))$ iff $\omega <_\Psi \omega'$.
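
Lemma 2.4.1 makes it easy to replay the two physicians of the example above. The following Python sketch is one possible encoding, not the author's: each physician's epistemic state is represented by a hypothetical ranking of worlds (lower rank means more plausible), revision is read as selecting minimal worlds in the sense of Theorem 2.4.1, and the helper names `minimal` and `accepts` are invented for this illustration.

```python
from itertools import product

# A world fixes which of the diseases A, B, C the patient has.
WORLDS = [dict(zip("ABC", bits)) for bits in product([False, True], repeat=3)]

def rank_physician_1(w):
    """Hypothetical ranking: A is most plausible, otherwise B without C."""
    if w["A"]:
        return 0
    return 1 if (w["B"] and not w["C"]) else 2

def rank_physician_2(w):
    """Hypothetical ranking: A is most plausible, otherwise C without B."""
    if w["A"]:
        return 0
    return 1 if (w["C"] and not w["B"]) else 2

def minimal(rank, formula):
    """min(A; Psi): the most plausible worlds satisfying the formula."""
    candidates = [w for w in WORLDS if formula(w)]
    best = min(rank(w) for w in candidates)
    return [w for w in candidates if rank(w) == best]

def accepts(rank, consequent, antecedent):
    """Lemma 2.4.1: (B|A) is accepted iff all minimal A-worlds satisfy B."""
    return all(consequent(w) for w in minimal(rank, antecedent))

top = lambda w: True
not_a = lambda w: not w["A"]

beliefs_1 = minimal(rank_physician_1, top)      # models of the first belief set
beliefs_2 = minimal(rank_physician_2, top)      # models of the second belief set
revised_1 = minimal(rank_physician_1, not_a)    # after learning "not A"
revised_2 = minimal(rank_physician_2, not_a)

print(beliefs_1 == beliefs_2, revised_1 == revised_2)
```

Both rankings induce the same belief set (exactly the worlds with disease $A$ are minimal), yet they accept different conditionals with antecedent $\neg A$, so the revised belief sets differ.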

Belief revision of an epistemic state, however, should not only deal with 
the revision of propositional beliefs but also with the modification of the revi- 
sion strategies maintained in that state ([DP97a, Bou93, BG93]). Therefore, 
taking these revision strategies as conditionals, revision of epistemic states 
should be concerned with changes in conditional beliefs and, the other way 
around, with the preservation of conditional beliefs. 

Investigating iterated revision, Darwiche and Pearl [DP97a] explicitly 
took conditional beliefs into account, and they advanced four postulates in 
addition to the AGM axioms to model what may be called conditional pre- 
servation under revision by propositional beliefs: 

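DP-postulates for conditional preservation:

(C1) If $C \models B$ then $\Psi \models (D \mid C)$ iff $\Psi * B \models (D \mid C)$.

(C2) If $C \models \overline{B}$ then $\Psi \models (D \mid C)$ iff $\Psi * B \models (D \mid C)$.

(C3) If $\Psi \models (B \mid A)$ then $\Psi * B \models (B \mid A)$.

(C4) If $\Psi * B \models (\overline{B} \mid A)$ then $\Psi \models (\overline{B} \mid A)$.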

For discussion of these postulates, see the original paper [DP97a]. 



² Rott [Rot91] investigates belief revision and conditionals within the framework
of epistemic entrenchment relations.

2.5 Probabilistic Reasoning — The ME-Approach 

Though probability distributions have long been appreciated as high-quality 
representations of epistemic states, probabilistic belief change has not really 
been a major topic of the field. This is mainly due to the following reasons: 

— Probabilistic beliefs are much more difficult to deal with than propositional 
beliefs. 

— Probabilistic deduction is complicated, and unfortunately, in many inte- 
resting cases quite meaningless (cf. [Nil86, TGK91]). 

— Even in probabilistic belief change, the paradigm has generally been to 
establish beliefs for certain, i.e. with probability 1 ([Gar88, DP94]). Two 
revision operations have been proposed for this purpose: (Bayesian) con- 
ditioning and imaging (cf. [Gar88]). 

A thorough treatment of uncertain probabilistic beliefs has not yet taken
place within the area of belief revision theory; most of the work in describing
and classifying belief change operations has been done for belief sets based on
classical logics. Gärdenfors [Gar88] and Dubois and Prade [DP94] developed
some axioms for probabilistic belief change, being mostly concerned with revising
in the sense of establishing facts for certain. While Gärdenfors, however,
claimed that conditioning corresponds to expansion (and thus to a certain
case of revision) in a probabilistic framework (cf. [Gar88, pp. 105 ff]), Dubois
and Prade emphasize that conditioning is not revising but focusing, i.e.
applying generic or background knowledge to the reference class describing
properly the case under consideration. Paris [Par94] and Voorbraak [Voo96a]
also consider probabilistic belief revision in the case of uncertain evidence.
An important approach to default probabilistic reasoning was developed by
Adams [Ada75]. His $\varepsilon$-semantics satisfies some basic inference schemes for
nonmonotonic reasoning and proved to yield proper representations of
system P-inferences (cf. [KLM90]; see also [Pea89]). For a brief overview of
qualitative probabilistic reasoning, see [Gol94].

Nevertheless, probabilistic reasoning and probabilistic belief change have
been investigated from different points of view in the long tradition of
probability theory (for surveys, see, for instance, [Pea88, Par94]).

As a generalization of standard Bayesian conditioning, Jeffrey conditionalization
allows one to modify a probability distribution so as to incorporate
a changed propositional probability ([Voo96a, Jef83, Par94]): Let $P$ denote a
probability distribution, and let $A \in \mathcal{L}$ be a proposition such that $P(A) > 0$.
Suppose we learn that the probability of $A$ has changed to $x$: $P'(A) = x$. Then
a revised probability function, $P'$, taking into account the new information
while obviously being related to the prior $P$ is given by Jeffrey’s rule:

$P'(B) = x P(B|A) + (1 - x) P(B|\neg A)$   (2.12)

for all propositions $B \in \mathcal{L}$. Note that for $x = 1$, we obtain Bayesian conditioning.
The rationale behind Jeffrey’s rule was to preserve conditional degrees
of belief with respect to $A$ and $\neg A$, i.e. the Jeffrey conditionalization as given
in (2.12) satisfies

$P'(B|A) = P(B|A)$ and $P'(B|\neg A) = P(B|\neg A)$

for all $B \in \mathcal{L}$. So, many years before nonmonotonic reasoning and belief revision
became important topics in Artificial Intelligence, not only had the issue
of belief change been addressed in probability theory, but also the necessity of
conditional preservation when revising epistemic states had been perceived.
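
Jeffrey's rule (2.12) is easy to carry out world by world: probabilities inside $A$ are rescaled to total mass $x$, those inside $\neg A$ to total mass $1 - x$. The Python sketch below illustrates this on a hypothetical prior over two atoms; the distribution, the value $x = 0.9$ and all function names are assumptions made for the example, and the prior must satisfy $0 < P(A) < 1$ for both rescalings to be defined.

```python
from itertools import product

# Worlds over two atoms (a, b); the prior P is a hypothetical example distribution.
WORLDS = list(product([False, True], repeat=2))
P = dict(zip(WORLDS, [0.1, 0.3, 0.2, 0.4]))   # order: (F,F), (F,T), (T,F), (T,T)

def prob(dist, event):
    """Probability of an event given as a predicate on worlds."""
    return sum(p for w, p in dist.items() if event(w))

def cond(dist, event, given):
    """Conditional probability dist(event | given)."""
    return prob(dist, lambda w: event(w) and given(w)) / prob(dist, given)

def jeffrey(dist, event, x):
    """Jeffrey's rule (2.12), applied world by world."""
    p_a = prob(dist, event)
    return {w: (x * p / p_a if event(w) else (1 - x) * p / (1 - p_a))
            for w, p in dist.items()}

A = lambda w: w[0]
B = lambda w: w[1]
not_A = lambda w: not w[0]

P_new = jeffrey(P, A, 0.9)

# The new probability of A is x, and conditional beliefs given A and given
# not-A are the same as in the prior.
print(round(prob(P_new, A), 6), round(prob(P_new, B), 6))
print(round(cond(P_new, B, A) - cond(P, B, A), 9),
      round(cond(P_new, B, not_A) - cond(P, B, not_A), 9))
```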

Jeffrey conditionalization, however, is only capable of dealing with one 
factual uncertain evidence. It is not apt to manage changes in conditional 
probabilities, nor to adopt a set of uncertain facts simultaneously (cf. [PV92]). 

A powerful tool to realize general changes in probabilistic beliefs has long
been available: the principle of minimum cross-entropy. The entropy

$H(P) = -\sum_{\omega \in \Omega} P(\omega) \log P(\omega)$

(with the convention $0 \log 0 = 0$) of a distribution $P$ first appeared as a
physical quantity in statistical mechanics and was later interpreted by Shannon
as an information-theoretic measure of the uncertainty inherent to $P$ (see
[SW76]; for a historical review, cf. [Jay83a]). It is generalized by the notion
of cross-entropy

$R(Q, P) = \sum_{\omega \in \Omega} Q(\omega) \log \frac{Q(\omega)}{P(\omega)}$

(with $0 \log \frac{0}{0} = 0$ and $Q(\omega) \log \frac{Q(\omega)}{0} = \infty$ for $Q(\omega) \neq 0$) between two
distributions $Q$ and $P$. If $P_0$ denotes the uniform distribution $P_0(\omega) = 1/m$ for all
worlds $\omega$, with $m = |\Omega|$ the number of worlds, then

$R(Q, P_0) = -H(Q) + \log m$

relates absolute and relative entropy.

Cross-entropy is a well-known information-theoretic measure of dissimi- 
larity between two distributions and has been studied extensively (see, for 
instance, [Csi75, HHJ92, Jay83a, Kul68]; for a brief, but informative intro- 
duction and further references, cf. [Sho86]; see also [SJ81]). Cross-entropy 
is also called directed divergence since it lacks symmetry, i.e. $R(Q, P)$ and
$R(P, Q)$ differ in general, so it is not a metric. But cross-entropy is positive,
that means we have $R(Q, P) \ge 0$, and $R(Q, P) = 0$ iff $Q = P$ (cf.
[Csi75, HHJ92, Sho86]).
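
The stated properties of entropy and cross-entropy can be checked numerically. The following Python sketch is a minimal illustration: distributions are plain lists over a fixed enumeration of worlds, the two example distributions are hypothetical, and natural logarithms are assumed (the identities hold for any base as long as it is used consistently).

```python
import math

def entropy(q):
    """H(Q) = -sum Q(w) log Q(w), with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in q if x > 0)

def cross_entropy(q, p):
    """R(Q, P) = sum Q(w) log(Q(w)/P(w)); infinite if Q puts mass where P has none."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0:
            continue              # 0 log(0/p) = 0, including the case 0 log(0/0)
        if pi == 0:
            return math.inf       # q log(q/0) = infinity for q != 0
        total += qi * math.log(qi / pi)
    return total

Q = [0.5, 0.25, 0.125, 0.125]     # hypothetical distributions over m = 4 worlds
P = [0.4, 0.3, 0.2, 0.1]
m = len(Q)
uniform = [1.0 / m] * m

# Directed divergence: the two orders of arguments give different values.
print(cross_entropy(Q, P), cross_entropy(P, Q))
# Relation between relative and absolute entropy: both numbers agree.
print(cross_entropy(Q, uniform), -entropy(Q) + math.log(m))
```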
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Consider the probabilistic belief revision problem 

(*prob) Given a (prior) distribution P and some set of probabilistic conditio- 
nals TZ = {{Bi\Ai) [xi], . . . , (Bn\An) [Xn]} C (£ I how should 

P be modified to yield a (posterior) distribution P* with P* ^ TZl 

When solving (*prob), the paradigm of informational economy, i.e. of minimal 
loss of information (see [Gar88, p. 49]), is realized in an intuitive way by 
following the principle of minimum cross-entropy 

$$\min\; R(Q, P) = \sum_{\omega \in \Omega} Q(\omega) \log \frac{Q(\omega)}{P(\omega)} \tag{2.13}$$

s.t. $Q$ is a probability distribution with $Q \models \mathcal{R}$.

For a distribution $P$ and some set $\mathcal{R}$ of probabilistic conditionals compatible
with $P$ (cf. [Csi75] or Definition 5.1.1 for the details) there is a (unique)
distribution $P_{ME} = P_{ME}(P, \mathcal{R})$ that fulfills $\mathcal{R}$ and has minimal relative
entropy to the prior $P$ (cf. [Csi75]), i.e. $P_{ME}$ solves (2.13) and thereby (*prob).
Note that (*prob) exceeds the framework of the classical AGM-theory with 
regard to several aspects: an epistemic state (P) is to be revised by a set of 
conditionals representing uncertain knowledge. 

Maximizing (absolute) entropy under some given constraints $\mathcal{R}$ is equivalent
to minimizing cross-entropy with respect to the uniform distribution,
given $\mathcal{R}$. Therefore, the principle of minimum cross-entropy can be regarded
as more general than the principle of maximum entropy

$$\max\; H(Q) = -\sum_{\omega \in \Omega} Q(\omega) \log Q(\omega) \tag{2.14}$$

s.t. $Q$ is a probability distribution with $Q \models \mathcal{R}$,

which solves the problem of representing $\mathcal{R}$ by a probability distribution
without adding information unnecessarily. We refer to both principles as the
ME-principle, where the abbreviation ME stands both for Minimum cross-Entropy
and for Maximum Entropy.

So, if $\mathcal{R}$ is a set of conditionals, each associated with a probability, then the
“best” distribution to represent $\mathcal{R}$ is the one which fulfills all conditionals in
$\mathcal{R}$ and has maximum entropy. By an analogous argument, if prior knowledge
given by a distribution $P$ has to be adjusted to new probabilistic knowledge
$\mathcal{R}$, the one distribution should be chosen that satisfies $\mathcal{R}$ and has minimum
relative entropy to $P$.
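For the special case of a single probabilistic conditional $(B|A)[x]$, the optimization (2.13) can be sketched in a few lines of Python. The closed form used below is an assumption of this sketch and is not stated in the text: it is what the usual Lagrange-multiplier argument yields for $0 < x < 1$ and $P(AB), P(A\overline{B}) > 0$, namely a posterior proportional to $P(\omega)\,\alpha^{1-x}$ on $AB$, to $P(\omega)\,\alpha^{-x}$ on $A\overline{B}$, and to $P(\omega)$ on $\overline{A}$, with $\alpha = x P(A\overline{B}) / ((1-x) P(AB))$. The names `me_update`, `relative_entropy`, `prior` and `rival` are hypothetical.

```python
import math

def relative_entropy(q, p):
    """R(Q, P) for distributions given as dicts over the same worlds (all P(w) > 0)."""
    return sum(qw * math.log(qw / p[w]) for w, qw in q.items() if qw > 0)

def me_update(prior, in_a, in_b, x):
    """Minimum cross-entropy posterior for one conditional (B|A)[x] (sketch only)."""
    p_ab = sum(p for w, p in prior.items() if in_a(w) and in_b(w))
    p_anb = sum(p for w, p in prior.items() if in_a(w) and not in_b(w))
    if not 0 < x < 1 or p_ab == 0 or p_anb == 0:
        raise ValueError("sketch covers only 0 < x < 1 with P(AB), P(A not-B) > 0")
    alpha = x * p_anb / ((1 - x) * p_ab)
    unnormalized = {}
    for w, p in prior.items():
        if not in_a(w):
            factor = 1.0                 # worlds outside A are only renormalized
        elif in_b(w):
            factor = alpha ** (1 - x)    # worlds confirming (B|A)
        else:
            factor = alpha ** (-x)       # worlds refuting (B|A)
        unnormalized[w] = p * factor
    z = sum(unnormalized.values())
    return {w: v / z for w, v in unnormalized.items()}

# Worlds are (a, b) truth-value pairs; the numbers are made up for illustration.
prior = {(True, True): 0.1, (True, False): 0.3, (False, True): 0.2, (False, False): 0.4}
is_a = lambda w: w[0]
is_b = lambda w: w[1]
posterior = me_update(prior, is_a, is_b, 0.8)

# Another distribution that also satisfies (B|A)[0.8]; it keeps P(A) fixed.
rival = {(True, True): 0.32, (True, False): 0.08, (False, True): 0.2, (False, False): 0.4}
print(relative_entropy(posterior, prior), relative_entropy(rival, prior))
```

Both `posterior` and `rival` satisfy the conditional, but `posterior` lies closer to the prior in terms of cross-entropy, which is what the principle asks for.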

To justify the ME-principles, several authors have demonstrated their
usefulness from points of view outside of information theory. For instance,
optimizing entropy is known to yield best expectation values in statistics (cf.
[GHK94, Jay83a]). Two other papers [PV90, SJ80] are concerned with
characterizing the ME-principles as logically consistent inference methods.
Shore and Johnson [JS83, SJ80] succeeded in proving (cross-)entropy to be
the only functional whose optimization satisfies four (resp. five) fundamental
axioms of probabilistic inference. A similar result is attained for
entropy in [PV90] by Paris and Vencovska, who do not assume that inference
is performed by optimizing a functional but base their characterization
on seven postulates. In a further paper [PV92], they justify the principle of
minimum cross-entropy as an appropriate method to update a probability
function by new uncertain evidence. Independence and invariance properties
are foremost among the properties these authors used for characterizing
ME-reasoning. This justifies ME-inference as an inference procedure of
minimal changes, but little was said about the nature or the extent of changes
actually occurring under ME-adjustment.

Applying the ME-principles means using an appropriate notion of distance
for choosing a “best” representation of new beliefs. In this context,
it is interesting to mention the paper of Lehmann, Magidor and Schlechta
[LMS01], which supposes propositional revisions to be obtained from a formal
(pseudo-)distance.




3. Conditionals 



This chapter is dedicated to conditionals as objects of crucial concern for 
knowledge representation, plausible reasoning and belief revision. The relati- 
onship between conditionals, epistemic states and beliefs is studied, and we 
develop the formal means to handle conditionals in revision and reasoning. 
In particular, we explain how conditional structures, imposed by conditionals 
on worlds, can be represented appropriately to investigate interrelated effects 
of conditionals. 

Parts of the ideas to be developed here can also be found in other papers 
(see, for instance, [KI98a, KI99c]), but most of the results presented in this 
chapter are new. 



3.1 Conditionals and Epistemic States 

Conditionals $(B|A)$ represent statements of the form “If $A$ then $B$”, conjoining
two propositional formulas: $A$, the antecedent or premise, and $B$, the
consequent. Many different approaches to a logic of conditionals have been
made (see, for instance, [Nut80, Lew73, DGC94, GGNR91]), also aiming at
reflecting more general relationships between antecedent and consequent so 
as to capture the manifold meanings of commonsense conditionals. In general, 
conditionals are used to describe plausible relationships between antecedent 
and consequent. Besides such qualitative approaches, the validity of condi- 
tionals may be quantified by degrees of certainty (see [Cal91]). Cox [Cox46] 
argued that a logically consistent handling of quantified conditionals is only 
possible within a probabilistic framework, where the degree of certainty as- 
sociated with a conditional is interpreted as a conditional probability (which 
should not be confused with assigning a probability to the conditional as a 
logical sentence, cf. [Lew76]; for a rigorous discussion of Cox’s Theorem, cf. 
[Hal99a, Hal99b]). 

G. Kern-Isberner: Conditionals in NMR and Belief Revision, LNCS 2087, pp. 27-52, 2001.

Conditionals may be given many different interpretations, for instance,
as counterfactuals, as indicative, subjunctive or normative conditionals etc.
(see [Nut80, Bou94]). Independently of its given meaning, however, a conditional
$(B|A)$ can be represented as a generalized indicator function on worlds,
setting

$$(B|A)(\omega) = \begin{cases} 1 & : \omega \models AB \\ 0 & : \omega \models A\overline{B} \\ u & : \omega \models \overline{A} \end{cases} \tag{3.1}$$

where $u$ stands for undefined (cf. [DeF74, Cal91]). Two conditionals are
considered equivalent iff the corresponding indicator functions are identical, i.e.

$$(B|A) \equiv (D|C) \quad\text{iff}\quad A \equiv C \text{ and } AB \equiv CD$$

(cf. [Cal91]).

This definition captures very well the three-valued, and thus non-classical,
character of conditionals. According to it, a conditional $(B|A)$ is a function
that polarizes $AB$ and $A\overline{B}$, leaving $\overline{A}$ untouched. Each possible world $\omega$ either
confirms $(B|A)$, or refutes it, or is of no relevance for it. So conditionals
are evaluated with respect to worlds, but considering only single, isolated
worlds is not enough to decide if a conditional (as an entity) is accepted or
not. To validate conditionals, we need richer epistemic structures than plain
propositional interpretations, at least to compare different worlds with regard
to their relevance for a conditional (see, for example, [Nut80, Bou94, DP97a]).
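The three-valued reading of (3.1) and the resulting notion of equivalence can be made concrete with a small Python sketch. It is an illustration only; representing worlds as dictionaries over two atoms `a`, `b` and formulas as Boolean functions on worlds is an assumption of the example, as are the names `conditional` and `equivalent`.

```python
from itertools import product

ATOMS = ("a", "b")
# All possible worlds over the atoms, in the order TT, TF, FT, FF.
WORLDS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=len(ATOMS))]

def conditional(consequent, antecedent):
    """Return (B|A) as a generalized indicator function with values 1, 0, 'u'."""
    def evaluate(world):
        if not antecedent(world):
            return "u"                       # world does not satisfy A: undefined
        return 1 if consequent(world) else 0 # world confirms or refutes (B|A)
    return evaluate

def equivalent(c1, c2):
    """Two conditionals are equivalent iff their indicator functions coincide."""
    return all(c1(w) == c2(w) for w in WORLDS)

A = lambda w: w["a"]
B = lambda w: w["b"]
AB = lambda w: w["a"] and w["b"]
TOP = lambda w: True

b_given_a = conditional(B, A)
values = [b_given_a(w) for w in WORLDS]
print(values)
```

In this encoding, `conditional(AB, A)` is equivalent to `b_given_a`, whereas the fact `conditional(B, TOP)` is not, because it also takes a stand on worlds falsifying `a`.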

An epistemic notion that turned out to be of great importance for conditionals
as well as for belief revision is that of plausibility: conditionals are
supposed to represent plausible conclusions, and plausibility relations on
formulas or worlds, respectively, guide AGM-revisions (cf. [Nut80, KM91a, DP97a];
for investigating the deep connection between conditional logic and belief
revision theory, see, for instance, [Gro88, FH94]; cf. also Section 2.4, in
particular the Ramsey test (2.10)). So we assume epistemic states $\Psi$ appropriate
to study conditionals and belief revision to be at least equipped with a
plausibility pre-ordering (ranking) on worlds (cf. Section 2.4).

There are other, more sophisticated methods to represent epistemic
attitudes; among the well-regarded representation forms for epistemic states
are probability functions, ordinal conditional functions (OCFs) and possibility
distributions ([Gar88, Spo88, DP97a, DP94]):

Definition 3.1.1. A probability function (or probability distribution) is a
map

$$P : \Omega \to [0, 1]$$

such that

$$\sum_{\omega \in \Omega} P(\omega) = 1.$$

Each probability function obviously induces a probability measure on $2^{\Omega}$
and vice versa, and to each propositional formula $A \in \mathcal{L}$, a probability may
be assigned by setting

$$P(A) = \sum_{\omega \models A} P(\omega).$$



The set of propositional formulas with probability 1 constitutes the belief set
of $P$:

$$Bel(P) = \{A \in \mathcal{L} \mid P(A) = 1\}.$$

Probabilities of conditionals are defined via conditional probabilities

$$P(B|A) = \frac{P(AB)}{P(A)}$$

for $P(A) \neq 0$.



Probability distributions are generally considered to be most adequate 
representations of epistemic states ([Gar88, Spo88]). They use the full scope 
of real numbers between 0 and 1 to specify knowledge, but they are also 
burdened with a lot of numbers. As a qualitative abstraction of probability 
functions, Spohn [Spo88] introduced ordinal conditional functions: 

Definition 3.1.2. Ordinal conditional functions (OCFs, ranking functions)
are functions $\kappa$ from worlds to ordinals such that some worlds are mapped to
the minimal element 0.

Throughout this book, we will simply assume that OCFs are functions
from the set of worlds to the natural numbers, extended by 0 and $\infty$:

$$\kappa : \Omega \to \mathbb{N} \cup \{0, \infty\},$$

where $\infty$ corresponds to the ordinal $\omega_0$. Ordinal conditional functions not
only induce a plausibility pre-order on the set $\Omega$ of worlds by $\omega_1 \leq_\kappa \omega_2$ iff
$\kappa(\omega_1) \leq \kappa(\omega_2)$, but furthermore, they specify non-negative integers as degrees
of plausibility - or, more precisely, as degrees of disbelief - to worlds. The
smaller $\kappa(\omega)$ is, the more plausible the world $\omega$ appears, and what is believed
(for certain) in the epistemic state represented by $\kappa$ is described precisely by
the set $Mod(\kappa) := \{\omega \in \Omega \mid \kappa(\omega) = 0\}$, and consequently,

$$Bel(\kappa) = \{A \in \mathcal{L} \mid \omega \models A \text{ for all } \omega \in Mod(\kappa)\}.$$

For propositional formulas $A, B \in \mathcal{L}$, we set

$$\kappa(A) = \min\{\kappa(\omega) \mid \omega \models A\},$$

so that

$$\kappa(A \vee B) = \min\{\kappa(A), \kappa(B)\}.$$

In particular, $0 = \min\{\kappa(A), \kappa(\overline{A})\}$, so that at least one of $A$ or $\overline{A}$ is considered
mostly plausible. A proposition $A$ is believed iff $\kappa(\overline{A}) > 0$ (which implies
$\kappa(A) = 0$), so that $A$ is believed iff $Bel(\kappa) \models A$. We abbreviate this by $\kappa \models A$.
Therefore, ordinal conditional functions not only allow us to compare worlds 
according to their plausibility but also to take the relative distances between 
them into account. So they can be considered as a refinement of the concept of 
simple conditional functions (cf. [Spo88]; see Section 2.4). For the connections 
between ordinal conditional functions and qualitative probabilistic reasoning, 
cf. [Spo88, DP97a, GP96]. 

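A conditional may be assigned a degree of plausibility via

$$\kappa(B|A) = \kappa(AB) - \kappa(A) = \begin{cases} 0 & : \kappa(AB) < \kappa(A\overline{B}) \\ \kappa(AB) - \kappa(A\overline{B}) & : \kappa(AB) \geq \kappa(A\overline{B}) \end{cases}$$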
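The rank-based notions introduced for ordinal conditional functions can be illustrated by a short Python sketch. This is an illustration under stated assumptions: worlds are `(a, b)` truth-value pairs, the ranking `kappa` is made up for the example, and the helper names `rank`, `believes` and `conditional_rank` are hypothetical.

```python
import math

def rank(kappa, formula):
    """kappa(A) = min{kappa(w) | w satisfies A}; infinity if A has no model."""
    return min((r for w, r in kappa.items() if formula(w)), default=math.inf)

def believes(kappa, formula):
    """A is believed iff the rank of its negation is greater than 0."""
    return rank(kappa, lambda w: not formula(w)) > 0

def conditional_rank(kappa, b, a):
    """kappa(B|A) = kappa(AB) - kappa(A), defined for finite kappa(A)."""
    rank_a = rank(kappa, a)
    if rank_a == math.inf:
        raise ValueError("kappa(B|A) is undefined when kappa(A) is infinite")
    return rank(kappa, lambda w: a(w) and b(w)) - rank_a

# Hypothetical ranking over worlds (a, b); at least one world has rank 0.
kappa = {(True, True): 0, (True, False): 2, (False, True): 1, (False, False): 0}
A = lambda w: w[0]
B = lambda w: w[1]
NOT_B = lambda w: not w[1]
A_IMPLIES_B = lambda w: (not w[0]) or w[1]

print(conditional_rank(kappa, B, A), conditional_rank(kappa, NOT_B, A))
```

Here the material implication `A_IMPLIES_B` is believed, since its negation has rank 2, while `A` itself is not believed, since both `A` and its negation have rank 0.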



Plausibility relations are dual to epistemic entrenchment relations ([Gro88,
Bou94]) which are essentially equivalent to qualitative necessity relations
([DP91c, Hoc99]). The quantitative counterpart of necessity relations are
necessity measures ([Dub86]) the dual of which are possibility measures, i.e.
functions $\Pi : \mathcal{L} \to [0, 1]$ observing logical equivalence and such that $\Pi(\top) = 1$,
$\Pi(\bot) = 0$ and

$$\Pi(A \vee B) = \max(\Pi(A), \Pi(B)) \tag{3.2}$$

(cf. [DLP94]). Each possibility measure $\Pi$ is determined by its values on
possible worlds,

$$\Pi(A) = \max_{\omega \models A} \Pi(\{\omega\}),$$

due to (3.2). Functions $\pi : \Omega \to [0, 1]$, $\pi(\omega) = \Pi(\{\omega\})$, are also called
possibility distributions. A conditional possibility may be defined by setting

$$\Pi(B|A) = \frac{\Pi(AB)}{\Pi(A)} \tag{3.3}$$

for $\Pi(A) \neq 0$ (cf. [DP94]). For the relationships between possibility theory
and both probability theory and ordinal conditional functions, see [DP94].



If $\mathcal{V}$ is the set of atomic propositions under consideration, then let $\mathcal{E}_{\mathcal{V}}$
be the set of all representations of epistemic states over $\mathcal{V}$ of a certain type.
For example, within a probabilistic framework, $\mathcal{E}_{\mathcal{V}}$ would be the set of all
probability distributions on $\mathcal{V}$. As soon as the type of representation for
epistemic states is fixed, we will not distinguish between an epistemic state
and its representation by an element of $\mathcal{E}_{\mathcal{V}}$. Furthermore, two epistemic states
will be considered equivalent if they are represented by the same element in
$\mathcal{E}_{\mathcal{V}}$. So the type of representation chosen is assumed to reflect all relevant
knowledge held in an epistemic state. Note that an epistemic state is supposed
to be a state of equilibrium which contains all explicitly stated as well as
all implicitly derived knowledge ([Gar88]). This then demands a complete
representation of epistemological knowledge.

To each type of representation of epistemic states, we choose a language
$(\mathcal{L} \mid \mathcal{L})^*$ that allows a suitable notation of conditionals which is in
accordance with the intended representation of epistemic states. Therefore, for
instance, for probability distributions and ordinal conditional functions, we
take $(\mathcal{L} \mid \mathcal{L})^* = (\mathcal{L} \mid \mathcal{L})^{prob}$ and $(\mathcal{L} \mid \mathcal{L})^* = (\mathcal{L} \mid \mathcal{L})^{OCF}$, respectively, where

$$(\mathcal{L} \mid \mathcal{L})^{prob} = \{(B|A)[x] \mid A, B \in \mathcal{L},\ x \in [0, 1]\}$$

and

$$(\mathcal{L} \mid \mathcal{L})^{OCF} = \{(B|A)[n] \mid A, B \in \mathcal{L},\ n \in \mathbb{N} \cup \{0, \infty\}\}.$$

In a purely qualitative setting, $(\mathcal{L} \mid \mathcal{L})^* = (\mathcal{L} \mid \mathcal{L})$ seems to be appropriate.
In correspondence to classical definitions, we set

$$Th^*(\Psi) = \{\phi \in (\mathcal{L} \mid \mathcal{L})^* \mid \Psi \models \phi\}$$

to denote all conditionals accepted in the epistemic state $\Psi$. Here the
acceptance relation $\models$ between epistemic states in $\mathcal{E}_{\mathcal{V}}$ and conditionals has to be
specified appropriately. We will deal with this in detail in Section 3.3.

We further assume (uniqueness assumption) that $Th^*(\Psi)$ describes $\Psi$
uniquely (up to representation equivalence):

$$Th^*(\Psi_1) = Th^*(\Psi_2) \quad\text{iff}\quad \Psi_1 = \Psi_2 \text{ in } \mathcal{E}_{\mathcal{V}} \tag{3.4}$$

This holds for all the representation types mentioned above, which is easy to
see in a quantitative environment, and for plausibility pre-orderings $\leq_\Psi$ we
have $\omega \leq_\Psi \omega'$ iff $(\omega \mid form(\omega, \omega')) \in Th^*(\Psi)$. In general, this assumption is
justified taking the view that an epistemic state is describable as a response
scheme to changes in belief and by observing the Ramsey test (cf. Section 2.4).



3.2 Conditional Valuation Functions 

What is common to probability functions, ordinal conditional functions, and
possibility measures is that they make use of two different operations to
handle both purely propositional information and conditionals adequately.
Therefore, we will introduce the abstract notion of a conditional valuation
function to reveal more clearly and uniformly the way in which knowledge
may be represented and treated within epistemic states. As an adequate
structure, we assume an algebra $\mathcal{A} = (\mathcal{A}, \oplus, \odot, 0^{\mathcal{A}}, 1^{\mathcal{A}})$ of real numbers to be
equipped with two operations, $\oplus$ and $\odot$, such that

- $(\mathcal{A}, \oplus)$ is an associative and commutative structure with neutral element
  $0^{\mathcal{A}}$;

- $(\mathcal{A} - \{0^{\mathcal{A}}\}, \odot)$ is a commutative group with neutral element $1^{\mathcal{A}}$;

- the rule of distributivity holds, i.e.

  $$x \odot (y \oplus z) = (x \odot y) \oplus (x \odot z)$$

  for $x, y, z \in \mathcal{A}$;

- $\mathcal{A}$ is totally ordered by $\leq_{\mathcal{A}}$, which is compatible with $\oplus$ and $\odot$ in that

  $$x \leq_{\mathcal{A}} y \text{ implies } x \oplus z \leq_{\mathcal{A}} y \oplus z \tag{3.5}$$

  $$x \leq_{\mathcal{A}} y \text{ implies } x \odot z \leq_{\mathcal{A}} y \odot z \tag{3.6}$$

  hold for all $x, y, z \in \mathcal{A}$.

So $\mathcal{A}$ is nearly an ordered field, except that the elements of $\mathcal{A}$ need not be
invertible with respect to $\oplus$.

Definition 3.2.1. A conditional valuation function is a function

$$V : \mathcal{L} \to \mathcal{A}$$

from the set $\mathcal{L}$ of formulas into the algebra $\mathcal{A}$ satisfying the following
conditions:

1. $V(\bot) = 0^{\mathcal{A}}$, $V(\top) = 1^{\mathcal{A}}$, and for exclusive formulas $A, B$ (i.e. $AB \equiv \bot$),
   we have

   $$V(A \vee B) = V(A) \oplus V(B);$$

2. for each conditional $(B|A) \in (\mathcal{L} \mid \mathcal{L})$ with $V(A) \neq 0^{\mathcal{A}}$,

   $$V(B|A) = V(AB) \odot V(A)^{-1}, \tag{3.7}$$

   where $V(A)^{-1}$ is the $\odot$-inverse element of $V(A)$ in $\mathcal{A}$.

Conditional valuation functions assign degrees of certainty, plausibility, 
possibility etc. to propositional formulas and to conditionals. Making use of 
two operations, they provide a framework for considering and treating condi- 
tional knowledge as fundamentally different from propositional knowledge, a 
point that is stressed by various authors and that seems to be indispensable 
for representing epistemic states adequately (cf. [DP97a]). There is, however, 
no deep conflict between these two different kinds of knowledge (that would be 
unintuitive, even disastrous) - conditionals should rather be regarded as
extending propositional knowledge by a new dimension, and facts may be
considered as conditionals of a degenerate form by identifying $A$ with $(A|\top)$. Note
that conditional valuation functions also take this compatibility into account:
According to Definition 3.2.1, we have $V(A|\top) = V(A) \odot (1^{\mathcal{A}})^{-1} = V(A)$.

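For each conditional valuation function $V$, we have

$$V(A) = \bigoplus_{\omega \models A} V(\omega),$$

so $V$ is determined uniquely by its values on interpretations or on possible
worlds, respectively, and we will also write $V : \Omega \to \mathcal{A}$. For all $A \in \mathcal{L}$, we
have $0^{\mathcal{A}} \leq_{\mathcal{A}} V(A) \leq_{\mathcal{A}} 1^{\mathcal{A}}$.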

Plausibility relations, as defined in [Gro88] (see also [Bou94]), are most 
appropriate for representing epistemic states and conditionals qualitatively. 
But such relations do not really fit the numerical framework of probability 
theory. So we need a more general notion to integrate conditional valuation 
functions in the field of nonmonotonic reasoning and belief revision. It is easy
to see that any such function $V : \mathcal{L} \to \mathcal{A}$ is a plausibility measure, in the sense
of Friedman and Halpern ([FH96, Fre98]), that is, it fulfills $V(\bot) \leq_{\mathcal{A}} V(A) \leq_{\mathcal{A}} V(\top)$
for all $A \in \mathcal{L}$, and $A \models B$ implies $V(A) \leq_{\mathcal{A}} V(B)$.

Two notions which are well-known from probability theory may be gene- 
ralized for conditional valuation functions: 

Definition 3.2.2. A conditional valuation function $V$ is said to be uniform
if $V(\omega) = V(\omega')$ for all worlds $\omega, \omega'$.

So uniform conditional valuation functions assign the same degree of
plausibility to each world.

Definition 3.2.3. Let $V$ be a conditional valuation function, let $A, B, C \in \mathcal{L}$
such that $V(C) \neq 0^{\mathcal{A}}$. $A$ and $B$ are called conditionally independent given
$C$ (with respect to $V$) if $V(A|BC) = V(A|C)$.

Some important examples will help to illustrate the newly introduced 
notion of a conditional valuation function: 

Example 3.2.1. Each probability function $P$ may be taken as a conditional
valuation function

$$P : \Omega \to (\mathbb{R}^+, +, \cdot, 0, 1)$$

where $\mathbb{R}^+$ denotes the set of all non-negative real numbers. Conversely, each
conditional valuation function $V : \Omega \to (\mathbb{R}^+, +, \cdot, 0, 1)$ is a probability
function.

Similarly, each ordinal conditional function $\kappa$ is a conditional valuation
function

$$\kappa : \Omega \to (\mathbb{Z} \cup \{\infty\}, \min, +, \infty, 0)$$

where $\mathbb{Z}$ denotes the set of all integers, and any possibility measure $\Pi$ can be
regarded as a conditional valuation function

$$\Pi : \Omega \to (\mathbb{R}^+, \max, \cdot, 0, 1).$$



Conditional valuation functions not only provide an abstract means to 
quantify epistemological attitudes. Their extended ranges allow us to calcu- 
late and compare arbitrary proportions of values attached to single worlds. 
This will prove quite useful to handle complex conditional interrelationships. 

Probability functions and ordinal conditional functions will serve as our 
standard examples for conditional valuation functions. 
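The common structure behind the three instances of Example 3.2.1 can be sketched generically in Python. This is an illustration under assumptions: the class `Algebra`, the instances `PROB`, `OCF`, `POSS`, the helpers `value` and `cond_value`, and all numerical values are made up for the example; worlds are `(a, b)` truth-value pairs.

```python
import math
from dataclasses import dataclass
from functools import reduce
from typing import Callable

@dataclass(frozen=True)
class Algebra:
    plus: Callable      # the operation "oplus"
    times: Callable     # the operation "odot"
    inverse: Callable   # inverse with respect to "odot"
    zero: float         # neutral element of "oplus"
    one: float          # neutral element of "odot"

PROB = Algebra(lambda x, y: x + y, lambda x, y: x * y, lambda x: 1 / x, 0.0, 1.0)
OCF = Algebra(min, lambda x, y: x + y, lambda x: -x, math.inf, 0)
POSS = Algebra(max, lambda x, y: x * y, lambda x: 1 / x, 0.0, 1.0)

def value(alg, v, formula):
    """V(A): combine the values of all worlds satisfying A with 'oplus'."""
    return reduce(alg.plus, (x for w, x in v.items() if formula(w)), alg.zero)

def cond_value(alg, v, b, a):
    """V(B|A) = V(AB) odot V(A)^(-1), as in (3.7); requires V(A) != zero."""
    v_a = value(alg, v, a)
    if v_a == alg.zero:
        raise ValueError("V(B|A) is undefined when V(A) is the zero element")
    return alg.times(value(alg, v, lambda w: a(w) and b(w)), alg.inverse(v_a))

A = lambda w: w[0]
B = lambda w: w[1]
NOT_B = lambda w: not w[1]

p = {(True, True): 0.2, (True, False): 0.2, (False, True): 0.1, (False, False): 0.5}
kappa = {(True, True): 0, (True, False): 2, (False, True): 1, (False, False): 0}
pi = {(True, True): 1.0, (True, False): 0.5, (False, True): 0.25, (False, False): 1.0}

print(cond_value(PROB, p, B, A), cond_value(OCF, kappa, NOT_B, A), cond_value(POSS, pi, NOT_B, A))
```

The same two functions reproduce conditional probability, conditional rank and conditional possibility; only the algebra passed in changes.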



3.3 Conditional Valuation Functions and Beliefs

By means of a conditional valuation function $V : \mathcal{L} \to \mathcal{A}$, we are able to
validate propositional as well as conditional beliefs. We may say, for instance,
that proposition $A$ is believed in $V$, $V \models A$, iff $V(A) = 1^{\mathcal{A}}$, or that the
conditional $(B|A)$ is valid or accepted in $V$, $V \models (B|A)$, iff $V(A) \neq 0^{\mathcal{A}}$ and
$V(AB) >_{\mathcal{A}} V(A\overline{B})$, i.e. iff $AB$ is more plausible (probable, possible etc.)
than $A\overline{B}$.

If $V = P$ is a probability function, then saying that $A$ is believed is usually
associated with $P(A) = 1$. On the other hand, the qualitative statement “the
conditional $(B|A)$ is valid in $P$” might be accepted iff $P(A\overline{B}) < P(AB)$. But
these two points of view prove to be incompatible with one another when
considering propositional beliefs $A$ as degenerate conditional beliefs $(A|\top)$.

If $V = \kappa$ is an ordinal conditional function, then stating that $\kappa(A) = 0$ is
not sufficient for establishing belief in $A$ - we must have $\kappa(\overline{A}) > 0$, which is
equivalent to stating $\kappa(A) < \kappa(\overline{A})$ and thus compatible with the acceptance of
$A$ as a degenerate conditional belief.

In a quantitative framework, it is possible - and sometimes appears to be
more adequate (see, for instance, [Spo88, FH99]) - to specify knowledge more
exactly by making use of quantified facts $A[x]$, and quantified conditionals
$(B|A)[x]$, respectively, where $x$ is an element of $\mathcal{A}$.

In accordance with the remarks above and with generally agreed points
of view, we say that a conditional $(B|A)[x]$ with $x \in [0, 1]$ is valid in or
accepted by a probability function $P$,

$$P \models (B|A)[x] \quad\text{iff}\quad P(A) > 0 \text{ and } P(B|A) = x. \tag{3.8}$$

In particular, we obtain for facts $A[x]$, $x \in [0, 1]$,

$$P \models A[x] \quad\text{iff}\quad P(A) = x. \tag{3.9}$$

Observing that ordinal conditional functions $\kappa$ specify degrees of disbelief,
a conditional $(B|A)[n]$ with $n \in \mathbb{N} \cup \{\infty\}$ is valid in or accepted by $\kappa$,

$$\kappa \models (B|A)[n] \quad\text{iff}\quad \kappa(\overline{B}|A) = n. \tag{3.10}$$

For facts $A[n]$, $n \in \mathbb{N} \cup \{\infty\}$, this means

$$\kappa \models A[n] \quad\text{iff}\quad \kappa(\overline{A}) = n. \tag{3.11}$$

This is exactly what Spohn [Spo88, p. 117] calls “$A$ is believed in $\kappa$ with
firmness $n$”.

Note that the exact specification of which beliefs and conditionals are 
accepted depends upon the type of the valuation function used. We will return 
to this issue later in the context of revision (see Section 4.3). 



For probability functions as well as for ordinal conditional functions, it is 
possible to use the notion of validity more vaguely by using inequalities in- 
stead of equalities (see, for example, the system by Goldszmidt and Pearl 
[GP96]). Here we prefer a crisp validity not only for formal reasons (in fact, 
stronger results hold e.g. for the optimum entropy approach in the case of 
equality constraints, cf. [SJ81]). If we consistently follow the argument
of Spohn [Spo88] that not only do qualitative relations matter in representing
epistemic attitudes but also distances between degrees of plausibility, then
this same argument should apply to conditionals, too, and we should
consider assigning numerical degrees of disbelief to conditionals as meaningful.
Similarly for probability functions, statements such as “a conditional $(B|A)$
holds with probability 0.8” or “a conditional $(B|A)$ holds with probability
0.99”, respectively, are usually taken as two different pieces of information.
Nevertheless, using inequality constraints may be very important and useful 
in modelling vague or incomplete knowledge. 

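Let $(\mathcal{L} \mid \mathcal{L})^{\mathcal{A}}$ be the set of all conditionals quantified by elements of $\mathcal{A}$:

$$(\mathcal{L} \mid \mathcal{L})^{\mathcal{A}} = \{(B|A)[x] \mid (B|A) \in (\mathcal{L} \mid \mathcal{L}),\ x \in \mathcal{A}\}.$$

Sets of quantified conditionals will sometimes be marked by a suitable
superscript, for instance, by $\mathcal{R}^{\mathcal{A}}$ in general, or by $\mathcal{R}^{OCF} \subseteq (\mathcal{L} \mid \mathcal{L})^{OCF}$, or by
$\mathcal{R}^{prob} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ etc. in corresponding settings. Quantified conditionals in
$(\mathcal{L} \mid \mathcal{L})^{prob}$ will be called probabilistic conditionals, and those in $(\mathcal{L} \mid \mathcal{L})^{OCF}$
will be called OCF-conditionals.



3.4 A Dynamic View on Conditionals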

Independently of its associated meaning - indicative, subjunctive,
counterfactual, etc. -, a conditional $(B|A)$ is an object of a three-valued nature,
partitioning the set of worlds $\Omega$ in three parts: those worlds satisfying $A \wedge B$
and thus confirming the conditional, those worlds satisfying $A \wedge \neg B$, thus
refuting the conditional, and those worlds not fulfilling the premise $A$,
to which the conditional may not be applied at all (cf. representation
(3.1), p. 28). So we define the affirmative set and the conflicting set of the
conditional $(B|A)$ to be

$$(B|A)^+ := \{\omega \in \Omega \mid \omega \models AB\} = Mod(AB),$$

$$(B|A)^- := \{\omega \in \Omega \mid \omega \models A\overline{B}\} = Mod(A\overline{B}),$$

respectively. $Mod(\overline{A})$ is called the neutral set of $(B|A)$. Each of these sets
may be empty. If $(B|A)^+ = \emptyset$, $(B|A)$ is called contradictory, if $(B|A)^- = \emptyset$,
$(B|A)$ is called tautological, and if $Mod(\overline{A}) = \emptyset$, $(B|A)$ is called a fact.

Example 3.4.1. $(\overline{A}|A)$ is a contradictory conditional, $(A|A)$ is tautological
and $(A|\top)$ is a fact.
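The partition of worlds induced by a conditional, and the three degenerate cases of Example 3.4.1, can be reproduced with a small Python sketch. It is an illustration only; worlds as `(a, b)` truth-value pairs and the helper names `mod`, `parts` and `classify` are assumptions of the example.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=2))   # worlds as (a, b) truth values

def mod(formula):
    """Mod(A): the set of worlds satisfying the formula."""
    return frozenset(w for w in WORLDS if formula(w))

def parts(b, a):
    """Affirmative, conflicting and neutral set of the conditional (B|A)."""
    plus = mod(lambda w: a(w) and b(w))
    minus = mod(lambda w: a(w) and not b(w))
    neutral = mod(lambda w: not a(w))
    return plus, minus, neutral

def classify(b, a):
    """Labels from the text that apply to (B|A); may be empty or contain several."""
    plus, minus, neutral = parts(b, a)
    labels = set()
    if not plus:
        labels.add("contradictory")
    if not minus:
        labels.add("tautological")
    if not neutral:
        labels.add("fact")
    return labels

A = lambda w: w[0]
B = lambda w: w[1]
NOT_A = lambda w: not w[0]
TOP = lambda w: True

print(classify(NOT_A, A), classify(A, A), classify(A, TOP), classify(B, A))
```

An ordinary conditional such as `(B|A)` receives no label at all, since all three of its sets are non-empty.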

The following lemma is only a reformulation of equivalence by using the 
introduced notions of affirmative and conflicting sets: 

Lemma 3.4.1. Two conditionals $(B|A)$, $(D|C)$ are equivalent iff their
corresponding affirmative and conflicting sets are equal, i.e.

$$(B|A) \equiv (D|C) \quad\text{iff}\quad (B|A)^+ = (D|C)^+ \text{ and } (B|A)^- = (D|C)^-.$$

Definition 3.4.1. A conditional $(D|C)$ is called a subconditional of $(B|A)$,
written as

$$(D|C) \sqsubseteq (B|A),$$

iff $(D|C)^+ \subseteq (B|A)^+$ and $(D|C)^- \subseteq (B|A)^-$.

Thus $(D|C) \sqsubseteq (B|A)$ if the effect of the former conditional on worlds is
in line with the latter one, but $(D|C)$ applies to fewer worlds. The $\sqsubseteq$-relation
may be expressed using the standard ordering $\leq$ between propositional
formulas: $A \leq B$ iff $A \models B$, i.e. iff $Mod(A) \subseteq Mod(B)$:

Lemma 3.4.2. Let $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$. Then $(D|C)$ is a subconditional
of $(B|A)$, $(D|C) \sqsubseteq (B|A)$, iff $CD \leq AB$ and $C\overline{D} \leq A\overline{B}$; in particular, if
$(D|C) \sqsubseteq (B|A)$ then $C \leq A$.

The proof of this lemma is straightforward, and also the following lemmata
are easily proved:

Lemma 3.4.3. For any two conditionals $(B|A)$ and $(D|C)$,

$$(B|A) \equiv (D|C) \quad\text{iff}\quad (B|A) \sqsubseteq (D|C) \text{ and } (D|C) \sqsubseteq (B|A).$$

Lemma 3.4.4. The relation $\sqsubseteq$ defines a pre-ordering on $(\mathcal{L} \mid \mathcal{L})$, i.e. a
reflexive and transitive relation, and induces an ordering on the set of equivalence
classes $(\mathcal{L} \mid \mathcal{L})/_{\equiv}$ with minimal element $(\top|\bot)$ and maximal elements $(A|\top)$,
$A \in \mathcal{L}$.

So we have a unique minimal element (an equivalence class of conditionals,
respectively) in $(\mathcal{L} \mid \mathcal{L})$, namely $(\top|\bot)$, whereas all the facts are maximal
elements in $(\mathcal{L} \mid \mathcal{L})$.

Therefore for any two conditionals $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$, the infimum
in $(\mathcal{L} \mid \mathcal{L})$ with respect to $\sqsubseteq$,

$$(B|A) \sqcap (D|C) := \inf\{(B|A), (D|C)\},$$

exists. The supremum of both conditionals, however,

$$(B|A) \sqcup (D|C) := \sup\{(B|A), (D|C)\},$$

only exists if

$$ABC\overline{D} \equiv A\overline{B}CD \equiv \bot \tag{3.12}$$

holds, for otherwise $(B|A)^+ \cup (D|C)^+$ and $(B|A)^- \cup (D|C)^-$ would not be
disjoint.

Lemma 3.4.5. The supremum $(B|A) \sqcup (D|C)$ of two conditionals $(B|A)$
and $(D|C)$ exists iff there is a conditional $(F|E)$ such that both $(B|A)$ and
$(D|C)$ are subconditionals of $(F|E)$.

In particular, each non-tautological and non-contradictory conditional can
be represented as the supremum of its basic subconditionals:

Definition 3.4.2. Basic conditionals are conditionals of the form

$$\psi_{\omega,\omega'} = (form(\omega) \mid form(\omega, \omega')) \tag{3.13}$$

for any two worlds $\omega, \omega' \in \Omega$ (cf. Definition 1.3.1, p. 9).

Lemma 3.4.6. Let $(B|A) \in (\mathcal{L} \mid \mathcal{L})$ be a non-tautological and
non-contradictory conditional. Then

$$(B|A) = \bigsqcup_{\omega \models AB,\ \omega' \models A\overline{B}} \psi_{\omega,\omega'} \tag{3.14}$$

where $\psi_{\omega,\omega'}$ is defined by (3.13).

Lemma 3.4.7. For any two non-tautological and non-contradictory
conditionals $(B|A), (D|C)$, it holds that $(D|C) \sqsubseteq (B|A)$ iff $\psi_{\omega,\omega'} \sqsubseteq (B|A)$ for all
basic subconditionals $\psi_{\omega,\omega'}$ of $(D|C)$.

We omit the straightforward proof of the foregoing lemma and the
technical proofs of Lemma 3.4.8 and Proposition 3.4.1.

Lemma 3.4.8. Let $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$. Then

$$(B|A) \sqcap (D|C) = (BD \mid AC(BD \vee \overline{B}\,\overline{D})), \tag{3.15}$$

and if $ABC\overline{D} \equiv A\overline{B}CD \equiv \bot$, then

$$(B|A) \sqcup (D|C) = (AB \vee CD \mid A \vee C). \tag{3.16}$$

Note that even if $ABC\overline{D} \equiv A\overline{B}CD \equiv \bot$ does not hold, (3.16) defines an
operation on conditionals which, however, does not coincide with the
supremum of the corresponding conditionals in this case.



Example 3.4.2. Consider the (non-tautological) conditionals $(B|A)$ and $(\overline{B}|A)$
violating condition (3.12). A blind application of (3.16) yields $(B|A) \sqcup
(\overline{B}|A) = (A|A)$, but $(A|A)^- = \emptyset$, so $(A|A)$ cannot be the supremum of
$(B|A)$ and $(\overline{B}|A)$.
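The subconditional relation, the infimum and the supremum can be explored in a small Python sketch in which a conditional is represented, up to equivalence, by its pair of affirmative and conflicting sets. This is an illustration under assumptions: worlds are `(a, b)` truth-value pairs, and the names `cond`, `is_sub`, `inf`, `sup`, `inf_formula` and `sup_formula` are hypothetical; the last two transcribe (3.15) and (3.16).

```python
from itertools import product

WORLDS = list(product([True, False], repeat=2))   # worlds as (a, b) truth values

def cond(b, a):
    """Represent (B|A) by the pair (affirmative set, conflicting set)."""
    plus = frozenset(w for w in WORLDS if a(w) and b(w))
    minus = frozenset(w for w in WORLDS if a(w) and not b(w))
    return plus, minus

def is_sub(dc, ba):
    """(D|C) is a subconditional of (B|A): both sets are included."""
    return dc[0] <= ba[0] and dc[1] <= ba[1]

def inf(x, y):
    """Infimum: intersect affirmative sets and conflicting sets."""
    return x[0] & y[0], x[1] & y[1]

def sup(x, y):
    """Supremum, or None if the united sets are not disjoint (condition (3.12) fails)."""
    plus, minus = x[0] | y[0], x[1] | y[1]
    return None if plus & minus else (plus, minus)

def inf_formula(b, a, d, c):
    """Right-hand side of (3.15)."""
    same = lambda w: (b(w) and d(w)) or (not b(w) and not d(w))
    return cond(lambda w: b(w) and d(w), lambda w: a(w) and c(w) and same(w))

def sup_formula(b, a, d, c):
    """Right-hand side of (3.16), applied without checking (3.12)."""
    return cond(lambda w: (a(w) and b(w)) or (c(w) and d(w)), lambda w: a(w) or c(w))

A = lambda w: w[0]
NOT_A = lambda w: not w[0]
B = lambda w: w[1]
NOT_B = lambda w: not w[1]
TOP = lambda w: True

b_a, nb_a = cond(B, A), cond(NOT_B, A)
b_top, b_na = cond(B, TOP), cond(B, NOT_A)
print(sup(b_a, nb_a), sup_formula(B, A, NOT_B, A) == cond(A, A))
```

For `(B|A)` and its counterpart with negated consequent the set-based supremum does not exist, while the blind formula returns the tautological conditional `(A|A)`, as in Example 3.4.2. By contrast, `(B|A)` and `(B|not A)` do have a supremum, namely the fact `(B|T)`.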

Proposition 3.4.1. $\sqcup$ and $\sqcap$, as defined by (3.16) and (3.15), are associative
and distributive operations on $(\mathcal{L} \mid \mathcal{L})$, if all required suprema exist.

Though $(\mathcal{L} \mid \mathcal{L})$, equipped with $\sqcup$ and $\sqcap$, has some convenient algebraic
properties, it is not a lattice, because the supremum $(B|A) \sqcup (D|C)$ is not
defined for arbitrary conditionals $(B|A)$ and $(D|C)$. In particular, it is not
a Boolean algebra. But this should not be considered a disadvantage. The
relation $\sqsubseteq$ is not aimed at managing the logical properties of conditionals
but at capturing their dynamic effects on worlds. (That is exactly the reason
why the application of $\sqcup$ fails in the example above.) And this effect on
worlds is a crucial factor in the framework of (conditional) belief revision.
Establishing a conditional $(B|A)$ within an epistemic state $\Psi$ means shifting
(some) worlds in $(B|A)^+$ and $(B|A)^-$ appropriately. Therefore the changes
brought about by conditional revision depend on a world’s being in one of
the sets $(B|A)^+$, $(B|A)^-$ and $Mod(\overline{A})$.

We will now introduce another relation between conditionals that is quite 
opposite to the subconditional relation and so describes another extreme of 
possible interaction:

Definition 3.4.3. Suppose $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$ are two conditionals.
$(D|C)$ is called perpendicular to $(B|A)$,

$$(D|C) \perp\!\!\!\perp (B|A),$$

iff either $Mod(C) \subseteq (B|A)^+$, or $Mod(C) \subseteq (B|A)^-$, or $Mod(C) \subseteq Mod(\overline{A})$.

The perpendicularity relation symbolizes a kind of irrelevance of one
conditional for another one. We have $(D|C) \perp\!\!\!\perp (B|A)$ if $Mod(C)$, i.e. the range of
application of the conditional $(D|C)$, is completely contained in exactly one
of the sets $(B|A)^+$, $(B|A)^-$ or $Mod(\overline{A})$. So for all worlds which $(D|C)$ may
be applied to, $(B|A)$ has the same effect and yields no further partitioning.
Note that $\perp\!\!\!\perp$ is not a symmetric relation; $(D|C) \perp\!\!\!\perp (B|A)$ rather expresses
that $(D|C)$ is not affected by $(B|A)$, or, that $(B|A)$ is irrelevant for $(D|C)$.
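Perpendicularity, and in particular its lack of symmetry, can be checked mechanically. The following Python sketch is an illustration only; worlds as `(a, b, c)` truth-value triples and the helper names `mod` and `perpendicular` are assumptions of the example. Since Definition 3.4.3 involves only the antecedent of the first conditional, the function takes just that antecedent.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=3))   # worlds as (a, b, c) truth values

def mod(formula):
    """Mod(A): the set of worlds satisfying the formula."""
    return frozenset(w for w in WORLDS if formula(w))

def perpendicular(c, b, a):
    """(D|C) is perpendicular to (B|A); only the antecedent C of (D|C) matters."""
    mod_c = mod(c)
    plus = mod(lambda w: a(w) and b(w))
    minus = mod(lambda w: a(w) and not b(w))
    neutral = mod(lambda w: not a(w))
    return mod_c <= plus or mod_c <= minus or mod_c <= neutral

A = lambda w: w[0]
NOT_A = lambda w: not w[0]
B = lambda w: w[1]
C = lambda w: w[2]
A_AND_B = lambda w: w[0] and w[1]

# (c|ab) is perpendicular to (b|a): every ab-world confirms (b|a).
forward = perpendicular(A_AND_B, B, A)
# (b|a) is not perpendicular to (c|ab): the a-worlds are split by (c|ab).
backward = perpendicular(A, C, A_AND_B)
print(forward, backward)
```

The first check succeeds because $Mod(ab)$ lies entirely inside the affirmative set of `(b|a)`; the second fails because the worlds satisfying `a` are spread over all three sets of `(c|ab)`.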

The following two lemmata provide characteristic properties of the per- 
pendicularity of conditionals (only the second lemma is proved in the Appen- 
dix): 

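Lemma 3.4.9. Let $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$. Then $(D|C) \perp\!\!\!\perp (B|A)$ iff either
$C \leq AB$, or $C \leq A\overline{B}$, or $C \leq \overline{A}$.

Lemma 3.4.10. Let $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$ be conditionals, and let
$(D|C)$ be neither tautological nor contradictory. Then $(D|C) \perp\!\!\!\perp (B|A)$ iff
$\psi_{\omega,\omega'} \perp\!\!\!\perp (B|A)$ for all basic subconditionals $\psi_{\omega,\omega'} \sqsubseteq (D|C)$ of $(D|C)$.

Because of Lemmata 3.4.7 and 3.4.10, one may say that in most cases,
the relations $\sqsubseteq$ and $\perp\!\!\!\perp$ may be checked by considering basic subconditionals.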

In the following section, we will pursue the idea of conditionals having 
effects on worlds further. As an interesting generalization, however, we will 
deal with sets of conditionals instead of regarding only one conditional at a 
time. 



3.5 Conditional Structures 

Due to their non-Boolean nature, conditionals are rather complicated objects. 
In particular, it is not an easy task to handle the relationships between them 
so as to preserve conditional dependencies “as far as possible” under adap- 
tation. To make the problem clear and to point out a possible way to solve 
it, we give an example which is taken from [Whi90] and which illustrates a 
phenomenon also well-known under the name “Simpson’s paradox”.

Example 3.5.1 (Florida murderers). This example is based on a real-life
investigation. During the six-year period 1973-79, about 5000 murder cases
were recorded in the US state of Florida, and the following probability
distribution $P$ mirrors the sentencing policy in those years (for further references,
cf. [Whi90, pp. 46f]). The propositional variables involved are $V$ = Victim (of
the murder) is black or white, respectively, $v \in \{v_b, v_w\}$, $M$ = Murderer is
black or white, respectively, $m \in \{m_b, m_w\}$, and $D$ = Murderer is sentenced
to Death, $d \in \{d, \overline{d}\}$.



    ω                         P(ω)      ω                         P(ω)
    $v_w m_w d$               0.0151    $v_w m_w \overline{d}$    0.4353
    $v_w m_b d$               0.0101    $v_w m_b \overline{d}$    0.0502
    $v_b m_w d$               0         $v_b m_w \overline{d}$    0.0233
    $v_b m_b d$               0.0023    $v_b m_b \overline{d}$    0.4637



Thus $P$ implies

$$P \models (d|m_w)[0.0319] \quad\text{and}\quad P \models (d|m_b)[0.0236],$$

so justice seemingly passed sentences without respect of color of skin.
Differences, however, become strikingly apparent if the third variable $V$, revealing
the color of skin of the victim, is also taken into account:

$$P \models (d|v_w m_w)[0.0335], \quad P \models (d|v_w m_b)[0.1675],$$

$$P \models (d|v_b m_w)[0], \quad P \models (d|v_b m_b)[0.0049].$$

If, for instance, the probability of the conditional $(d|m_b)[0.0236]$ is to be
changed, the probabilities of the conditionals $(d|vm)$ containing important
information should be preserved in an adequate manner.
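The conditional probabilities of this example follow directly from the table and can be recomputed with a few lines of Python. The sketch below is an illustration only; the encoding of worlds as `(victim, murderer, death)` triples and the helper names `prob` and `death_given` are assumptions of the example, while the numbers are those of the table.

```python
# Worlds are (victim, murderer, sentenced_to_death); probabilities from the table.
P = {
    ("vw", "mw", True): 0.0151, ("vw", "mw", False): 0.4353,
    ("vw", "mb", True): 0.0101, ("vw", "mb", False): 0.0502,
    ("vb", "mw", True): 0.0,    ("vb", "mw", False): 0.0233,
    ("vb", "mb", True): 0.0023, ("vb", "mb", False): 0.4637,
}

def prob(event):
    """P(A) as the sum over all worlds satisfying the event."""
    return sum(p for w, p in P.items() if event(w))

def death_given(condition):
    """P(d | condition) = P(condition and d) / P(condition)."""
    return prob(lambda w: condition(w) and w[2]) / prob(condition)

# Conditioning on the murderer only.
marginal = {
    m: round(death_given(lambda w, m=m: w[1] == m), 4)
    for m in ("mw", "mb")
}
# Conditioning on victim and murderer.
detailed = {
    (v, m): round(death_given(lambda w, v=v, m=m: w[0] == v and w[1] == m), 4)
    for v in ("vw", "vb")
    for m in ("mw", "mb")
}
print(marginal)
print(detailed)
```

The two dictionaries contain the marginal values for $(d|m_w)$, $(d|m_b)$ and the four values for $(d|vm)$, rounded to four decimals.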



This last example illustrates a strange but typical behavior that marginal 
distributions and the conditionals involved may have (we will continue it 
later on, see Example 4.5.1 in Section 4.5). Let us look upon this problem in 
a more abstract environment. 



Suppose P is a probability distribution on a set of variables containing
a, b, and suppose $P \models (b|a)[x]$. In which way may a third variable, c, affect
this conditional, i.e. what can be said about the probability of $(b|ac)$ in P?



Roughly, there are two possibilities. In the first case, c does not affect
$(b|a)[x]$ at all, that is to say, we have $P(b|ac) = P(b|a)$, showing b and c to be
conditionally independent given a (cf. Definition 3.2.3), and c to be irrelevant
for $(b|a)[x]$. By a straightforward calculation, we see that $P(b|ac) = P(b|a)$ iff

$$\frac{P(abc)\,P(a\overline{b}\,\overline{c})}{P(ab\overline{c})\,P(a\overline{b}c)} = 1.$$

In the second, more usual case, we have $P(b|ac) \neq P(b|a)$, and consequently

$$\frac{P(abc)\,P(a\overline{b}\,\overline{c})}{P(ab\overline{c})\,P(a\overline{b}c)} \neq 1.$$

Thus departures from conditional independence - and thereby the extent of
relevance - may be measured by the cross product ratio or interaction quotient
$\frac{P(abc)\,P(a\overline{b}\,\overline{c})}{P(ab\overline{c})\,P(a\overline{b}c)}$.
A reasonable demand for a posterior distribution $P^*$ adapted to a changed
probability of $(b|a)$ then is that posterior interaction should be the same as
prior interaction, i.e.

$$\frac{P^*(abc)\,P^*(a\overline{b}\,\overline{c})}{P^*(ab\overline{c})\,P^*(a\overline{b}c)}
  = \frac{P(abc)\,P(a\overline{b}\,\overline{c})}{P(ab\overline{c})\,P(a\overline{b}c)} \qquad (3.17)$$

In statistics, logarithms of such expressions are used to measure the
interactions between the variables involved (cf. [Goo63, Whi90]).
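
To make the demand (3.17) concrete, the following sketch computes the
interaction quotient for the data of Example 3.5.1, reading a as $m_b$, b as $d$
and c as $v_w$. It then changes the probability of $(b|a)$ by a simple
multiplicative rescaling. This rescaling (the function `rescale` and the factors
3.0 and 0.5) is a hypothetical choice made only for illustration; it is one way
of changing $P(b|a)$ that happens to leave the quotient untouched.

```python
# Worlds are triples (a, b, c) of truth values with a = m_b, b = d, c = v_w;
# the numbers are taken from the table of Example 3.5.1.
P = {
    (True, True, True): 0.0101,  (True, True, False): 0.0023,
    (True, False, True): 0.0502, (True, False, False): 0.4637,
    (False, True, True): 0.0151, (False, True, False): 0.0,
    (False, False, True): 0.4353, (False, False, False): 0.0233,
}

def interaction_quotient(dist):
    """Cross product ratio P(abc)P(a~b~c) / (P(ab~c)P(a~bc))."""
    numerator = dist[(True, True, True)] * dist[(True, False, False)]
    denominator = dist[(True, True, False)] * dist[(True, False, True)]
    return numerator / denominator

def cond_b_given_a(dist):
    """P(b|a)."""
    p_a = sum(p for (a, b, c), p in dist.items() if a)
    p_ab = sum(p for (a, b, c), p in dist.items() if a and b)
    return p_ab / p_a

def rescale(dist, alpha_plus, alpha_minus):
    """Multiply ab-worlds by alpha_plus and a~b-worlds by alpha_minus, then normalise."""
    raw = {}
    for (a, b, c), p in dist.items():
        if a and b:
            factor = alpha_plus      # world confirms (b|a)
        elif a:
            factor = alpha_minus     # world refutes (b|a)
        else:
            factor = 1.0             # (b|a) is not applicable
        raw[(a, b, c)] = p * factor
    total = sum(raw.values())
    return {world: p / total for world, p in raw.items()}

P_star = rescale(P, alpha_plus=3.0, alpha_minus=0.5)

print("P(b|a)  prior:", cond_b_given_a(P), " posterior:", cond_b_given_a(P_star))
print("quotient prior:", interaction_quotient(P),
      " posterior:", interaction_quotient(P_star))
```

Numerator and denominator of the quotient each contain exactly one confirming
and one refuting world, so the factors introduced by the rescaling cancel.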

In the general case, we consider joint influences of groups of variables
(instead of one single variable) on a conditional $(B|A)$, and we have to take
a set $\mathcal{R}$ of conditionals into account. Thus the notion of (statistical)
interaction quotients has to be generalized, involving more worlds both in the
numerators and in the denominators and being based appropriately on $\mathcal{R}$.
The comments above give interaction quotients a logical meaning that fits the
intention of this treatise better than a statistical interpretation. It offers a
suitable way to carry out the necessary generalization from a conditional-logical
point of view.


In (3.17), two sets of worlds are related to each other with respect to $P$
and $P^*$: $\{abc, a\overline{b}\,\overline{c}\}$ in the numerator, and
$\{ab\overline{c}, a\overline{b}c\}$ in the denominator. In both sets, the
conditional $(b|a)$ is once confirmed (by $abc$ and by $ab\overline{c}$,
respectively) and once refuted (by $a\overline{b}\,\overline{c}$ and
$a\overline{b}c$, respectively), so both sets show the same behavior with regard
to the revising conditional $(b|a)[x]$. This idea of a behavior or structure with
respect to $\mathcal{R}$ can be formalized appropriately by group-theoretical
means, as will be developed in the sequel.

When we consider (finite) sets of conditionals
$\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\} \subseteq (\mathcal{L} \mid \mathcal{L})$,
we have to modify the representation given in (3.1), p. 28, appropriately to
identify the effect of each conditional in $\mathcal{R}$ on worlds in $\Omega$.
This leads to introducing the functions $\sigma_i = \sigma_{(B_i|A_i)}$ below
(see (3.18)) which generalize (3.1) by replacing the numbers 0 and 1 by abstract
symbols. Moreover, we will make use of a group structure to represent the joint
impact of conditionals on worlds.

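To each conditional $(B_i|A_i)$ in $\mathcal{R}$ we associate two symbols
$\mathbf{a}_i^+, \mathbf{a}_i^-$. Let

$$\mathcal{F}_{\mathcal{R}} = \langle \mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^- \rangle$$

be the free abelian group with generators
$\mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^-$, i.e.
$\mathcal{F}_{\mathcal{R}}$ consists of all elements of the form
$(\mathbf{a}_1^+)^{r_1} (\mathbf{a}_1^-)^{s_1} \cdots (\mathbf{a}_n^+)^{r_n} (\mathbf{a}_n^-)^{s_n}$
with integers $r_i, s_i \in \mathbb{Z}$ (the ring of integers). Each element of
$\mathcal{F}_{\mathcal{R}}$ can be identified by its exponents, so that
$\mathcal{F}_{\mathcal{R}}$ is isomorphic to $\mathbb{Z}^{2n}$ (cf. [LS77, FR99]).
The commutativity of $\mathcal{F}_{\mathcal{R}}$ corresponds to the fact that the
conditionals in $\mathcal{R}$ shall be effective simultaneously, without assuming
any order of application.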

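For each $i$, $1 \leq i \leq n$, we define a function
$\sigma_i = \sigma_{(B_i|A_i)} : \Omega \to \mathcal{F}_{\mathcal{R}}$ by setting

$$\sigma_i(\omega) = \begin{cases}
  \mathbf{a}_i^+ & \text{if } (B_i|A_i)(\omega) = 1 \\
  \mathbf{a}_i^- & \text{if } (B_i|A_i)(\omega) = 0 \\
  1 & \text{if } (B_i|A_i)(\omega) = u
\end{cases} \qquad (3.18)$$

$\sigma_i(\omega)$ represents the manner in which the conditional $(B_i|A_i)$
applies to the possible world $\omega$. The neutral element 1 of
$\mathcal{F}_{\mathcal{R}}$ corresponds to the non-applicability of $(B_i|A_i)$
in case that the antecedent $A_i$ is not satisfied. The function
$\sigma_{\mathcal{R}} : \Omega \to \mathcal{F}_{\mathcal{R}}$,

$$\sigma_{\mathcal{R}}(\omega) = \prod_{1 \leq i \leq n} \sigma_i(\omega)
  = \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i B_i}} \mathbf{a}_i^+
    \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i \overline{B_i}}} \mathbf{a}_i^- \qquad (3.19)$$

describes the overall effect of $\mathcal{R}$ on $\omega$.
$\sigma_{\mathcal{R}}(\omega)$ is called (a representation of) the conditional
structure of $\omega$ with respect to $\mathcal{R}$. For each world $\omega$,
$\sigma_{\mathcal{R}}(\omega)$ contains at most one of each $\mathbf{a}_i^+$ or
$\mathbf{a}_i^-$, but never both of them because each conditional applies to
$\omega$ in a well-defined way. The next lemma shows that this property
characterizes conditional structure functions: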

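Lemma 3.5.1. Let $\sigma : \Omega \to \mathcal{F}$ be a map from the set of worlds
$\Omega$ to the free abelian group
$\mathcal{F} = \langle \mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^- \rangle$
generated by $\mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^-$,
such that $\sigma(\omega)$ contains at most one of each $\mathbf{a}_i^+$ or
$\mathbf{a}_i^-$, for each world $\omega \in \Omega$. Then there is a set of
conditionals $\mathcal{R}$ with $card(\mathcal{R}) \leq n$ such that
$\sigma = \sigma_{\mathcal{R}}$.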

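Example 3.5.2. Let $\mathcal{R} = \{(c|a), (c|b)\}$, where a, b, c are atoms, and
let $\mathcal{F}_{\mathcal{R}} = \langle \mathbf{a}_1^+, \mathbf{a}_1^-, \mathbf{a}_2^+, \mathbf{a}_2^- \rangle$.
We associate $\mathbf{a}_1^{\pm}$ with the first conditional, $(c|a)$, and
$\mathbf{a}_2^{\pm}$ with the second one, $(c|b)$. The following table shows the
values of the function $\sigma_{\mathcal{R}}$ on worlds $\omega \in \Omega$:

    $\omega$                         $\sigma_{\mathcal{R}}(\omega)$         $\omega$                                   $\sigma_{\mathcal{R}}(\omega)$

    $abc$                            $\mathbf{a}_1^+ \mathbf{a}_2^+$        $\overline{a}bc$                           $\mathbf{a}_2^+$
    $ab\overline{c}$                 $\mathbf{a}_1^- \mathbf{a}_2^-$        $\overline{a}b\overline{c}$                $\mathbf{a}_2^-$
    $a\overline{b}c$                 $\mathbf{a}_1^+$                       $\overline{a}\,\overline{b}c$              $1$
    $a\overline{b}\,\overline{c}$    $\mathbf{a}_1^-$                       $\overline{a}\,\overline{b}\,\overline{c}$ $1$

$abc$ confirms both conditionals, so its conditional structure is represented by
$\mathbf{a}_1^+ \mathbf{a}_2^+$. This corresponds to the product (in
$\mathcal{F}_{\mathcal{R}}$) of the conditional structures of the worlds
$a\overline{b}c$ and $\overline{a}bc$. Two worlds, namely
$\overline{a}\,\overline{b}c$ and $\overline{a}\,\overline{b}\,\overline{c}$, are
not affected at all by the conditionals in $\mathcal{R}$.

The next example illustrates that multiple copies of worlds may also be
necessary to relate conditional structures:

Example 3.5.3. Consider the set $\mathcal{R} = \{(d|a), (d|b), (d|c)\}$ of
conditionals using the atoms a, b, c, d. Let
$\mathbf{a}_1^{\pm}, \mathbf{a}_2^{\pm}, \mathbf{a}_3^{\pm}$ be the group
generators associated with $(d|a)$, $(d|b)$, $(d|c)$, respectively. Then we have

$$\begin{aligned}
\sigma_{\mathcal{R}}(ab\overline{c}d)\,\sigma_{\mathcal{R}}(a\overline{b}cd)\,\sigma_{\mathcal{R}}(\overline{a}bcd)
  &= (\mathbf{a}_1^+ \mathbf{a}_2^+)(\mathbf{a}_1^+ \mathbf{a}_3^+)(\mathbf{a}_2^+ \mathbf{a}_3^+) \\
  &= (\mathbf{a}_1^+)^2 (\mathbf{a}_2^+)^2 (\mathbf{a}_3^+)^2 \\
  &= (\mathbf{a}_1^+ \mathbf{a}_2^+ \mathbf{a}_3^+)^2 \\
  &= \sigma_{\mathcal{R}}(abcd)^2
\end{aligned}$$

Here two copies of $abcd$, or of its structure, respectively, are necessary to
match the product of the conditional structures of $ab\overline{c}d$,
$a\overline{b}cd$ and $\overline{a}bcd$.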
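
The table of Example 3.5.2 and the identity of Example 3.5.3 can be recomputed
mechanically. In the sketch below, an element of the free abelian group is
stored as its exponent vector $(r_1, s_1, \ldots, r_n, s_n)$, so that the group
product becomes componentwise addition. The restriction to conditionals whose
antecedent and consequent are single atoms, the `~` notation for negated atoms
and all function names are assumptions made for this illustration.

```python
from itertools import product

def sigma(world, conditionals):
    """Exponent vector (r_1, s_1, ..., r_n, s_n) of the conditional structure of a world.

    `world` maps atoms to truth values; each conditional (B|A) is given as a
    pair (B, A) of atom names. r_i = 1 if the world confirms the i-th
    conditional, s_i = 1 if it refutes it; both are 0 if it is not applicable.
    """
    exps = []
    for consequent, antecedent in conditionals:
        applicable = world[antecedent]
        exps.append(int(applicable and world[consequent]))
        exps.append(int(applicable and not world[consequent]))
    return tuple(exps)

def multiply(*structures):
    """Product in the free abelian group = componentwise sum of exponents."""
    return tuple(sum(column) for column in zip(*structures))

def show(exps):
    """Render an exponent vector as a product of generators a1+, a1-, a2+, ..."""
    parts = []
    for k, e in enumerate(exps):
        if e:
            name = f"a{k // 2 + 1}{'+' if k % 2 == 0 else '-'}"
            parts.append(name if e == 1 else f"({name})^{e}")
    return " ".join(parts) or "1"

def worlds(atoms):
    """All truth assignments to the given atoms."""
    for bits in product([True, False], repeat=len(atoms)):
        yield dict(zip(atoms, bits))

def label(world):
    """Write a world as a string, e.g. 'a~bc' for a, not b, c."""
    return "".join(atom if value else "~" + atom for atom, value in world.items())

# Example 3.5.2: R = {(c|a), (c|b)}
R2 = [("c", "a"), ("c", "b")]
table = {label(w): show(sigma(w, R2)) for w in worlds("abc")}
for name, structure in table.items():
    print(f"{name:8s} {structure}")

# Example 3.5.3: R = {(d|a), (d|b), (d|c)}
R3 = [("d", "a"), ("d", "b"), ("d", "c")]
w = lambda a, b, c, d: {"a": a, "b": b, "c": c, "d": d}
lhs = multiply(sigma(w(True, True, False, True), R3),
               sigma(w(True, False, True, True), R3),
               sigma(w(False, True, True, True), R3))
rhs = multiply(sigma(w(True, True, True, True), R3),
               sigma(w(True, True, True, True), R3))
print(show(lhs), "=", show(rhs))
```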

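To compare worlds adequately with respect to their conditional structures,
we impose a multiplication on the set of worlds $\Omega$ by considering the
worlds $\omega$ as formal symbols. That means, we introduce the free abelian
group $\widehat{\Omega}$ generated by all $\omega \in \Omega$

$$\widehat{\Omega} := \langle \omega \mid \omega \in \Omega \rangle \qquad (3.20)$$

and consisting of all products

$$\hat{\omega} = \omega_1^{r_1} \cdots \omega_m^{r_m}, \quad \omega_1, \ldots, \omega_m \in \Omega, \text{ and } r_1, \ldots, r_m \text{ integers.}$$

Now $\sigma_{\mathcal{R}}$ may be extended to $\widehat{\Omega}$ in a
straightforward manner by setting

$$\sigma_{\mathcal{R}}(\hat{\omega}) = \sigma_{\mathcal{R}}(\omega_1^{r_1} \cdots \omega_m^{r_m})
  = \sigma_{\mathcal{R}}(\omega_1)^{r_1} \cdots \sigma_{\mathcal{R}}(\omega_m)^{r_m},$$

yielding a homomorphism of groups

$$\sigma_{\mathcal{R}} : \widehat{\Omega} \to \mathcal{F}_{\mathcal{R}}$$

For $\hat{\omega} = \omega_1^{r_1} \cdots \omega_m^{r_m} \in \widehat{\Omega}$, we
obtain the group element

$$\sigma_{\mathcal{R}}(\hat{\omega}) = \prod_{1 \leq i \leq n}
  (\mathbf{a}_i^+)^{\sum_{k: \omega_k \models A_i B_i} r_k}\,
  (\mathbf{a}_i^-)^{\sum_{k: \omega_k \models A_i \overline{B_i}} r_k} \qquad (3.21)$$

as representation of the conditional structure of $\hat{\omega}$. We will often
use fractional representations for the elements of $\widehat{\Omega}$, that is,
for instance, we will write $\frac{\omega_1}{\omega_2}$ instead of
$\omega_1 \omega_2^{-1}$.

Thus the conditional structure of $\hat{\omega}$ is represented by a group element
which is a product of the generators $\mathbf{a}_i^+, \mathbf{a}_i^-$ of
$\mathcal{F}_{\mathcal{R}}$, with each $\mathbf{a}_i^+$ occurring with exponent
$\sum_{k: \sigma_i(\omega_k) = \mathbf{a}_i^+} r_k = \sum_{k: \omega_k \models A_i B_i} r_k$
and each $\mathbf{a}_i^-$ occurring with exponent
$\sum_{k: \sigma_i(\omega_k) = \mathbf{a}_i^-} r_k = \sum_{k: \omega_k \models A_i \overline{B_i}} r_k$
(note that each of the sums may be empty, in which case the corresponding
conditional cannot be applied to any of the worlds occurring in $\hat{\omega}$).
So the exponent of $\mathbf{a}_i^+$ in $\sigma_{\mathcal{R}}(\hat{\omega})$
indicates the number of worlds in $\hat{\omega}$ which confirm the conditional
$(B_i|A_i)$, each world being counted with its multiplicity, and in the same way,
the exponent of $\mathbf{a}_i^-$ indicates the number of elementary events that
are in conflict with $(B_i|A_i)$. The elements of $\widehat{\Omega}$ replace the
multi-sets considered in [KI98a], allowing a more coherent and elegant handling
of conditional structures.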
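
The extension of $\sigma_{\mathcal{R}}$ to products of worlds amounts to simple
bookkeeping of exponents, as in (3.21). The following sketch stores an element
$\omega_1^{r_1} \cdots \omega_m^{r_m}$ as a dictionary from world labels to
integer exponents and returns the exponents of the generators. The label syntax
(`~` for negation), the restriction to atomic antecedents and consequents, and
the function names are our own assumptions for illustration.

```python
def parse(label):
    """Turn a label such as 'ab~cd' into a dict atom -> truth value."""
    values, negate = {}, False
    for ch in label:
        if ch == "~":
            negate = True
        else:
            values[ch] = not negate
            negate = False
    return values

def sigma_hat(element, conditionals):
    """Conditional structure of an element of the free abelian group over worlds.

    `element` maps world labels to integer exponents r_k. The result maps each
    generator name ('a1+', 'a1-', ...) to its exponent as in (3.21); generators
    with exponent 0 are omitted, so the empty dict stands for the identity 1.
    """
    exponents = {}
    for label, r in element.items():
        world = parse(label)
        for i, (consequent, antecedent) in enumerate(conditionals, start=1):
            if world[antecedent]:
                generator = f"a{i}{'+' if world[consequent] else '-'}"
                exponents[generator] = exponents.get(generator, 0) + r
    return {g: e for g, e in exponents.items() if e != 0}

# R = {(d|a), (d|b), (d|c)} as in Example 3.5.3, written as (consequent, antecedent).
R = [("d", "a"), ("d", "b"), ("d", "c")]

three_worlds = {"ab~cd": 1, "a~bcd": 1, "~abcd": 1}
two_copies = {"abcd": 2}
quotient = {**three_worlds, "abcd": -2}   # three_worlds divided by abcd^2

print(sigma_hat(three_worlds, R))
print(sigma_hat(two_copies, R))
print(sigma_hat(quotient, R))             # identity: all effects cancel
```

Negative exponents correspond to worlds written in the denominator of a
fractional representation.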

In particular, it is possible to isolate the (positive or negative) net impact
of one conditional in $\mathcal{R}$ by considering suitable elements of
$\widehat{\Omega}$, as the following example illustrates:



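Example 3.5.4 (continued). In Example 3.5.2 above, we have

$$\sigma_{\mathcal{R}}\left(\frac{abc}{\overline{a}bc}\right)
  = \frac{\mathbf{a}_1^+ \mathbf{a}_2^+}{\mathbf{a}_2^+} = \mathbf{a}_1^+.$$

So $\frac{abc}{\overline{a}bc}$ reveals the positive net impact of the conditional
$(c|a)$ within $\mathcal{R}$, symbolized by $\mathbf{a}_1^+$.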

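Similarly, in Example 3.5.3, the element
$\frac{ab\overline{c}\,\overline{d} \cdot \overline{a}\,\overline{b}c\overline{d}}{a\overline{b}c\overline{d}}$
isolates the negative net impact of the second conditional, $(d|b)$:

$$\sigma_{\mathcal{R}}\left(\frac{ab\overline{c}\,\overline{d} \cdot \overline{a}\,\overline{b}c\overline{d}}{a\overline{b}c\overline{d}}\right)
  = \frac{\mathbf{a}_1^- \mathbf{a}_2^- \cdot \mathbf{a}_3^-}{\mathbf{a}_1^- \mathbf{a}_3^-} = \mathbf{a}_2^-.$$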

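The generators $\mathbf{a}_i^{\pm}$ are mere symbols, representing the effects of
the corresponding conditional on worlds. If we choose different symbols
$\mathbf{b}_1^+, \mathbf{b}_1^-, \ldots, \mathbf{b}_n^+, \mathbf{b}_n^-$ to be
associated with the conditionals in $\mathcal{R}$, we arrive at a different
representation homomorphism
$\sigma'_{\mathcal{R}} : \widehat{\Omega} \to \mathcal{F}'_{\mathcal{R}} = \langle \mathbf{b}_1^+, \mathbf{b}_1^-, \ldots, \mathbf{b}_n^+, \mathbf{b}_n^- \rangle$.
But for all $\hat{\omega}_1, \hat{\omega}_2 \in \widehat{\Omega}$, we have

$$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)
  \quad \text{iff} \quad
  \sigma'_{\mathcal{R}}(\hat{\omega}_1) = \sigma'_{\mathcal{R}}(\hat{\omega}_2)$$

as can easily be seen from (3.21). This means

$$\ker \sigma_{\mathcal{R}} = \ker \sigma'_{\mathcal{R}}$$

where $\ker \sigma$ denotes the kernel of a homomorphism $\sigma$, i.e.

$$\ker \sigma := \{\hat{\omega} \in \widehat{\Omega} \mid \sigma(\hat{\omega}) = 1\}$$

Hence, the kernel of such a representation homomorphism does not depend
on the symbols chosen as group generators and therefore, it is an invariant
of the set of conditionals $\mathcal{R}$. The kernel $\ker \sigma_{\mathcal{R}}$
contains exactly all group elements $\hat{\omega} \in \widehat{\Omega}$ with a
balanced conditional structure, that means, where all effects of conditionals in
$\mathcal{R}$ on worlds occurring in $\hat{\omega}$ are completely cancelled.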

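Having the same conditional structure defines an equivalence relation
$\equiv_{\mathcal{R}}$ on $\widehat{\Omega}$:

$$\hat{\omega}_1 \equiv_{\mathcal{R}} \hat{\omega}_2 \quad \text{iff} \quad
  \sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2).$$

The equivalence classes are in one-to-one correspondence to the elements of
the quotient group
$\widehat{\Omega} / \ker \sigma_{\mathcal{R}} = \{\hat{\omega} \cdot (\ker \sigma_{\mathcal{R}}) \mid \hat{\omega} \in \widehat{\Omega}\}$
by observing that

$$\hat{\omega}_1 \equiv_{\mathcal{R}} \hat{\omega}_2 \quad \text{iff} \quad
  \sigma_{\mathcal{R}}(\hat{\omega}_1 \hat{\omega}_2^{-1}) = 1.$$

Because the kernel $\ker \sigma_{\mathcal{R}}$ is an invariant of $\mathcal{R}$,
$\equiv_{\mathcal{R}}$ does not depend on the chosen representation either.
Therefore, the equivalence class

$$[\hat{\omega}]_{\equiv_{\mathcal{R}}} = \{\hat{\omega}' \in \widehat{\Omega} \mid
  \sigma_{\mathcal{R}}(\hat{\omega}') = \sigma_{\mathcal{R}}(\hat{\omega})\}$$

of an element $\hat{\omega} \in \widehat{\Omega}$ is called its conditional
structure with respect to $\mathcal{R}$. According to (3.21), we have

$$\omega_1^{r_1} \cdots \omega_m^{r_m} \equiv_{\mathcal{R}} {\omega'_1}^{s_1} \cdots {\omega'_p}^{s_p}$$

iff for all $i$, $1 \leq i \leq n$,

$$\sum_{k: \omega_k \models A_i B_i} r_k = \sum_{l: \omega'_l \models A_i B_i} s_l
  \quad \text{and} \quad
  \sum_{k: \omega_k \models A_i \overline{B_i}} r_k = \sum_{l: \omega'_l \models A_i \overline{B_i}} s_l. \qquad (3.22)$$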

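The kernel of $\sigma_{\mathcal{R}}$ plays an important part in identifying the
conditional structure of elements $\hat{\omega} \in \widehat{\Omega}$, in
particular of worlds $\omega$, with respect to the set of conditionals
$\mathcal{R}$. No nontrivial relations hold between different group generators
$\mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^-$ of
$\mathcal{F}_{\mathcal{R}}$, so we have $\sigma_{\mathcal{R}}(\hat{\omega}) = 1$
iff $\sigma_i(\hat{\omega}) = 1$ for all $i$, $1 \leq i \leq n$, and this means

$$\ker \sigma_{\mathcal{R}} = \bigcap_{i=1}^{n} \ker \sigma_i \qquad (3.23)$$

In this way, each conditional in $\mathcal{R}$ contributes to
$\ker \sigma_{\mathcal{R}}$. The kernel of $\sigma_{\mathcal{R}}$, however, is
not apt to describe uniquely the set $\mathcal{R}$ of conditionals.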

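Example 3.5.5. Consider the four sets

$\mathcal{R}_0 = \{(c|a), (c|b)\}$
$\mathcal{R}_1 = \{(c|a), (c|b), (c|\overline{a})\}$
$\mathcal{R}_2 = \{(c|a), (c|b), (c|\overline{a}), (c|\overline{b})\}$
$\mathcal{R}_3 = \{(c|a), (c|b), (c|\overline{b})\}$

Some easy calculations show that

$$\ker \sigma_{\mathcal{R}_1} = \ker \sigma_{\mathcal{R}_2} = \ker \sigma_{\mathcal{R}_3}
  = \left\langle
    \frac{abc \cdot \overline{a}\,\overline{b}c}{a\overline{b}c \cdot \overline{a}bc},\;
    \frac{ab\overline{c} \cdot \overline{a}\,\overline{b}\,\overline{c}}{a\overline{b}\,\overline{c} \cdot \overline{a}b\overline{c}}
  \right\rangle$$

and

$$\ker \sigma_{\mathcal{R}_0}
  = \left\langle
    \overline{a}\,\overline{b}c,\;
    \overline{a}\,\overline{b}\,\overline{c},\;
    \frac{abc}{a\overline{b}c \cdot \overline{a}bc},\;
    \frac{ab\overline{c}}{a\overline{b}\,\overline{c} \cdot \overline{a}b\overline{c}}
  \right\rangle.$$

So we have $\ker \sigma_{\mathcal{R}_1} = \ker \sigma_{\mathcal{R}_2} = \ker \sigma_{\mathcal{R}_3}$
and $\mathcal{R}_1 \cap \mathcal{R}_2 \cap \mathcal{R}_3 = \mathcal{R}_0$, but
$\ker \sigma_{\mathcal{R}_0} \neq \ker \sigma_{\mathcal{R}_i}$ for $i = 1, 2, 3$.

In general, if $(B|A)$ is a conditional with
$\ker \sigma_{\mathcal{R}} \subseteq \ker \sigma_{(B|A)}$, then
$\ker \sigma_{\mathcal{R}} = \ker \sigma_{\mathcal{R} \cup \{(B|A)\}}$, as may
easily be seen from (3.23). Moreover, Example 3.5.5 above illustrates that it is
not always possible to find a minimal (with respect to set inclusion) set of
conditionals $\mathcal{R}'$ with
$\ker \sigma_{\mathcal{R}} = \ker \sigma_{\mathcal{R}'}$.

In particular, a conditional $(B|A)$ and its negation $(\overline{B}|A)$ give rise
to the same kernel:

Lemma 3.5.2. Let $(B|A) \in (\mathcal{L} \mid \mathcal{L})$. Then
$\ker \sigma_{(B|A)} = \ker \sigma_{(\overline{B}|A)}$.

This is evident by considering (3.21).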

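The subgroup

$$\widehat{\Omega}_0 = \left\langle \frac{\omega_1}{\omega_2} \;\Big|\; \omega_1, \omega_2 \in \Omega \right\rangle$$

of $\widehat{\Omega}$ generated by all quotients $\frac{\omega_1}{\omega_2}$ is of
particular interest in connection with conditionals because it only contains
ratios and iterated ratios formed by products. It focusses on comparing actions
and interactions of conditionals on worlds. Thus by considering
$\widehat{\Omega}_0$, it is possible to reveal genuinely conditional influences
hidden e.g. by normalizing constraints (see Section 3.6; cf. also Lemma 3.5.3).
The elements of $\widehat{\Omega}_0$ may be described easily:

$$\widehat{\Omega}_0 = \Big\{\hat{\omega} = \omega_1^{r_1} \cdot \ldots \cdot \omega_m^{r_m} \in \widehat{\Omega} \;\Big|\; \sum_{j=1}^{m} r_j = 0\Big\}.$$

Two elements $\hat{\omega}_1 = \omega_1^{r_1} \cdots \omega_m^{r_m}$,
$\hat{\omega}_2 = \nu_1^{s_1} \cdots \nu_p^{s_p} \in \widehat{\Omega}$ are
equivalent modulo $\widehat{\Omega}_0$, i.e.
$\hat{\omega}_1 \widehat{\Omega}_0 = \hat{\omega}_2 \widehat{\Omega}_0$, iff
$\sum_{1 \leq j \leq m} r_j = \sum_{1 \leq k \leq p} s_k$. This means that
$\hat{\omega}_1$ and $\hat{\omega}_2$ are equivalent modulo $\widehat{\Omega}_0$
iff they both are a (cancelled) product of the same number of generators, each
generator being counted with its corresponding exponent.

Let

$$\ker\nolimits_0 \sigma_{\mathcal{R}} := \ker \sigma_{\mathcal{R}} \cap \widehat{\Omega}_0$$

be the part of $\ker \sigma_{\mathcal{R}}$ which is included in
$\widehat{\Omega}_0$. $\ker_0 \sigma_{\mathcal{R}}$ is less expressive than
$\ker \sigma_{\mathcal{R}}$; for instance, it does not contain all
$\omega \in \Omega$ with $\sigma_{\mathcal{R}}(\omega) = 1$. But
$\ker_0 \sigma_{\mathcal{R}}$ concentrates on considering ratios as essential
entities to reveal the influences of conditionals. The following lemma shows that
$\ker_0 \sigma_{\mathcal{R}}$ differs from $\ker \sigma_{\mathcal{R}}$ only by
taking the conditional tautology $(\top|\top)$ into regard:

Lemma 3.5.3. $\widehat{\Omega}_0 = \ker \sigma_{(\top|\top)}$, and
$\ker_0 \sigma_{\mathcal{R}} = \ker \sigma_{\mathcal{R} \cup \{(\top|\top)\}}$.

So by considering $\ker_0 \sigma_{\mathcal{R}}$, implicit normalizing constraints
(such as $P(\top) = 1$ for probability functions or $\kappa(\top) = 0$ for ordinal
conditional functions) can be taken explicitly into account.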



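Finally, we will show how to describe the relations $\sqsubseteq$ and
$\perp\!\!\!\perp$ between conditionals, introduced in Definitions 3.4.1 and
3.4.3, respectively, by considering the kernels of the corresponding
$\sigma$-homomorphisms. As a convenient notation, for each proposition
$A \in \mathcal{L}$ we define

$$\widehat{A} = \{\hat{\omega} = \omega_1^{r_1} \cdots \omega_m^{r_m} \in \widehat{\Omega} \mid
  \omega_i \models A \text{ for all } i, 1 \leq i \leq m\}. \qquad (3.24)$$

Proposition 3.5.1. Let $(B|A), (D|C) \in (\mathcal{L} \mid \mathcal{L})$ be
conditionals.

1. $(D|C)$ is either a subconditional of $(B|A)$ or of $(\overline{B}|A)$ iff
$C \models A$ and
$\ker \sigma_{(D|C)} \cap \widehat{C} \subseteq \ker \sigma_{(B|A)} \cap \widehat{C}$.

2. $(D|C) \perp\!\!\!\perp (B|A)$ iff
$\widehat{C} \cap \widehat{\Omega}_0 \subseteq \ker \sigma_{(B|A)}$.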



Within the framework of representing, revising and discovering conditional
knowledge, the notion of conditional indifference of a conditional valuation
function, to be introduced in the next Section 3.6, will play a central
part (cf. Sections 4.5, 4.6 and 8.2). For knowledge discovery, it will be crucial
to determine a suitable set $\mathcal{R}$ of conditionals such that
$\ker \sigma_{\mathcal{R}} \subseteq \widehat{\Omega}_1$ or
$\ker_0 \sigma_{\mathcal{R}} \subseteq \widehat{\Omega}_1$, respectively, for some
given subgroup $\widehat{\Omega}_1 \subseteq \widehat{\Omega}$ (ideally,
equality should hold; see Section 3.6 and Section 8.2 for the details). The set

$$\begin{aligned}
\mathcal{R}(\widehat{\Omega}_1)
  &= \{(B|A) \in (\mathcal{L} \mid \mathcal{L}) \mid \sigma_{(B|A)}(\hat{\omega}) = 1 \text{ for all } \hat{\omega} \in \widehat{\Omega}_1\} \\
  &= \{(B|A) \in (\mathcal{L} \mid \mathcal{L}) \mid \widehat{\Omega}_1 \subseteq \ker \sigma_{(B|A)}\}
\end{aligned}$$

is obviously a maximal candidate for such a set $\mathcal{R}$. But determining
$\mathcal{R}(\widehat{\Omega}_1)$ is quite an expensive task. The next lemma and
the following corollary provide a first easy criterion for excluding
conditionals from being elements of $\mathcal{R}(\widehat{\Omega}_1)$:







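Lemma 3.5.4. Let $(B|A) \in (\mathcal{L} \mid \mathcal{L})$. For any basic
subconditional $\psi_{\omega,\omega'} \sqsubseteq (B|A)$,
$\sigma_{(B|A)}\left(\frac{\omega}{\omega'}\right) \neq 1$.

Corollary 3.5.1. If $\frac{\omega}{\omega'} \in \ker \sigma_{\mathcal{R}}$, then
$\psi_{\omega,\omega'} \not\sqsubseteq (B|A)$ for any $(B|A) \in \mathcal{R}$.

In general, for a given element $\hat{\omega} \in \widehat{\Omega}$ it is easy to
find conditionals $(B|A)$ with $\hat{\omega} \in \ker \sigma_{(B|A)}$. Suppose, for
instance, $\hat{\omega} = \frac{\omega_1}{\nu_1} \cdot \ldots \cdot \frac{\omega_r}{\nu_r}$
to be an element in $\widehat{\Omega}_0$. Set
$\Omega_1 := \{\omega_1, \ldots, \omega_r, \nu_1, \ldots, \nu_r\}$ and
$A := form(\Omega_1)$, and choose $\Omega_2 \subseteq \Omega_1$ suitably to yield
$\sigma_{(B|A)}(\hat{\omega}) = 1$ with $B := form(\Omega_2)$. The following
example will illustrate this.

Example 3.5.6. Let the alphabet consist of the three atoms a, b, c. Consider
the element
$\hat{\omega} = \frac{abc \cdot \overline{a}\,\overline{b}c}{a\overline{b}c \cdot \overline{a}bc} \in \widehat{\Omega}$.
Set $A := form(abc, \overline{a}\,\overline{b}c, a\overline{b}c, \overline{a}bc) = c$,
and choose, for instance, $B_1 = form(abc, a\overline{b}c) = ac$ and
$B_2 = form(abc, \overline{a}bc) = bc$. Then for each of $(B_1|A) = (a|c)$ and
$(B_2|A) = (b|c)$, we have $\sigma_{(B_i|A)}(\hat{\omega}) = 1$.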
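
The claim of Example 3.5.6 can be checked by counting exponents. The sketch
below represents the element from the example as a dictionary of exponents and
a conditional as a pair of predicates on worlds; these encodings and the
function names are assumptions made for illustration. As a contrast, we also
try the consequent $form(abc, \overline{a}\,\overline{b}c)$, i.e. "a and b agree",
which is our own additional test case and shows that the subset of worlds has to
be chosen suitably.

```python
def parse(label):
    """Turn a label such as '~a~bc' into a dict atom -> truth value."""
    values, negate = {}, False
    for ch in label:
        if ch == "~":
            negate = True
        else:
            values[ch] = not negate
            negate = False
    return values

def sigma_single(element, consequent, antecedent):
    """Exponents (r, s) of a+ and a- in the structure of `element` w.r.t. (B|A).

    `element` maps world labels to integer exponents; `consequent` and
    `antecedent` are predicates on worlds. The element lies in the kernel
    of the conditional iff the result is (0, 0).
    """
    r = s = 0
    for label, exponent in element.items():
        world = parse(label)
        if antecedent(world):
            if consequent(world):
                r += exponent
            else:
                s += exponent
    return r, s

# The element (abc * ~a~bc) / (a~bc * ~abc) from Example 3.5.6.
omega_hat = {"abc": 1, "~a~bc": 1, "a~bc": -1, "~abc": -1}
A = lambda w: w["c"]

a_given_c = sigma_single(omega_hat, lambda w: w["a"], A)
b_given_c = sigma_single(omega_hat, lambda w: w["b"], A)
agree_given_c = sigma_single(omega_hat, lambda w: w["a"] == w["b"], A)

print("(a|c):", a_given_c)
print("(b|c):", b_given_c)
print("(a<->b | c):", agree_given_c)
print("exponent sum:", sum(omega_hat.values()))
```

Both $(a|c)$ and $(b|c)$ yield the pair (0, 0), whereas the third conditional
does not, although its antecedent is the same.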

In this way, considering the elements of subgroups $\widehat{\Omega}_1$ helps us
find appropriate sets of conditionals. But interactions of conditionals
considerably complicate this task. In Section 8.2, we will present a method that
is suited to discovering conditional structures for a special but important type
of conditionals.



3.6 Conditional Indifference 

To study conditional interactions, we now focus on the behavior of conditional
valuation functions $V : \mathcal{L} \to \mathcal{A}$ with respect to the
“multiplication” $\odot$ in $\mathcal{A}$ (see Definition 3.2.1, p. 32). Each such
function may be extended to a homomorphism

$$V : \widehat{\Omega}_+ \to (\mathcal{A}, \odot)$$

by setting

$$V(\omega_1^{r_1} \cdot \ldots \cdot \omega_m^{r_m}) = V(\omega_1)^{r_1} \odot \ldots \odot V(\omega_m)^{r_m},$$

where $\widehat{\Omega}_+$ is the subgroup of $\widehat{\Omega}$ generated by the
set $\Omega_+ := \{\omega \in \Omega \mid V(\omega) \neq 0^{\mathcal{A}}\}$. This
allows us to analyze numerical relationships holding between different
$V(\hat{\omega})$. Thereby, it will be possible to elaborate the conditionals
whose structures V follows, that means, to determine sets of conditionals
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ with respect to which V is
indifferent:

Definition 3.6.1. Suppose $V : \mathcal{L} \to \mathcal{A}$ is a conditional
valuation function and $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ is a
set of conditionals such that $V(A) \neq 0^{\mathcal{A}}$ for all
$(B|A) \in \mathcal{R}$.

1. V is strictly indifferent with respect to $\mathcal{R}$ iff the following two
conditions hold:

(i) If $V(\omega) = 0^{\mathcal{A}}$ then there is $(B|A) \in \mathcal{R}$ such
that $\sigma_{(B|A)}(\omega) \neq 1$ and $V(\omega') = 0^{\mathcal{A}}$ for all
$\omega'$ with $\sigma_{(B|A)}(\omega') = \sigma_{(B|A)}(\omega)$.

(ii) $V(\hat{\omega}_1) = V(\hat{\omega}_2)$ whenever
$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)$ for
$\hat{\omega}_1, \hat{\omega}_2 \in \widehat{\Omega}_+$.

2. V is (weakly) indifferent with respect to $\mathcal{R}$ iff V is strictly
indifferent with respect to $\mathcal{R} \cup \{(\top|\top)\}$.

If V is strictly indifferent with respect to
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$, then it does not
distinguish between different elements $\hat{\omega}_1, \hat{\omega}_2$ with the
same conditional structure with respect to $\mathcal{R}$. Conversely, any
deviation $V(\hat{\omega}) \neq 1^{\mathcal{A}}$ can be explained by the
conditionals in $\mathcal{R}$ acting on $\hat{\omega}$ in a non-balanced way.
Weak indifference means that tautologies are taken explicitly into account;
it concentrates on ratios which conditionals are based upon. So in many
applications, it will appear to be the weaker but more adequate notion.
Using Lemma 3.5.3, V is (weakly) indifferent with respect to $\mathcal{R}$ iff (i)
holds and $V(\hat{\omega}_1) = V(\hat{\omega}_2)$ whenever
$\hat{\omega}_1 \widehat{\Omega}_0 = \hat{\omega}_2 \widehat{\Omega}_0$ and
$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)$ for
$\hat{\omega}_1, \hat{\omega}_2 \in \widehat{\Omega}_+$.

Condition (i) in Definition 3.6.1(1) is necessary to deal with worlds
$\omega \notin \Omega_+$.

Lemma 3.6.1. If the conditional valuation function V is (strictly or weakly)
indifferent with respect to $\mathcal{R}$, then
$\sigma_{\mathcal{R}}(\omega_1) = \sigma_{\mathcal{R}}(\omega_2)$ implies
$V(\omega_1) = V(\omega_2)$ for all worlds $\omega_1, \omega_2 \in \Omega$.

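The following proposition rephrases conditional indifference by establishing a
relationship between the kernels of $\sigma_{\mathcal{R}}$ and V:

Proposition 3.6.1. Let $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ be a
set of conditionals, and let $V : \mathcal{L} \to \mathcal{A}$ be a conditional
valuation function with $V(A) \neq 0^{\mathcal{A}}$ for all
$(B|A) \in \mathcal{R}$.

1. V is strictly indifferent with respect to $\mathcal{R}$ iff condition (i) of
Definition 3.6.1(1) holds, and
$\ker \sigma_{\mathcal{R}} \cap \widehat{\Omega}_+ \subseteq \ker V$.

2. V is (weakly) indifferent with respect to $\mathcal{R}$ iff condition (i) of
Definition 3.6.1(1) holds, and
$\ker_0 \sigma_{\mathcal{R}} \cap \widehat{\Omega}_+ \subseteq \ker_0 V$.

If, in particular, $\ker \sigma_{\mathcal{R}} \cap \widehat{\Omega}_+ = \ker V$, or
$\ker_0 \sigma_{\mathcal{R}} \cap \widehat{\Omega}_+ = \ker_0 V$, respectively,
then $V(\hat{\omega}_1) = V(\hat{\omega}_2)$ if and only if
$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)$ for
$\hat{\omega}_1, \hat{\omega}_2 \in \widehat{\Omega}_+$ (and
$\hat{\omega}_1 \widehat{\Omega}_0 = \hat{\omega}_2 \widehat{\Omega}_0$). In this
case, V completely follows the conditional structures imposed by $\mathcal{R}$ -
it observes $\mathcal{R}$ faithfully:

Definition 3.6.2. Let $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ be a
set of conditionals, and let $V : \mathcal{L} \to \mathcal{A}$ be a conditional
valuation function with $V(A) \neq 0^{\mathcal{A}}$ for all
$(B|A) \in \mathcal{R}$.

$V : \mathcal{L} \to \mathcal{A}$ is said to be strictly faithful (or (weakly)
faithful) with respect to $\mathcal{R}$, iff
$\ker_{(0)} \sigma_{\mathcal{R}} \cap \widehat{\Omega}_+ = \ker_{(0)} V$.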

We will close this section by characterizing probability functions and or- 
dinal conditional functions with indifference properties: 

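Theorem 3.6.1. A probability function P is strictly indifferent with respect
to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ of conditionals iff
$P(A_i) \neq 0$ for all $i$, $1 \leq i \leq n$, and there are real numbers
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^- \in \mathbb{R}^+$ such that

$$P(\omega) = \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i B_i}} \alpha_i^+
             \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i \overline{B_i}}} \alpha_i^- \qquad (3.25)$$

for all $\omega \in \Omega$.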

This theorem can be rephrased using group-theoretical terms: 

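Corollary 3.6.1. A probability function P is strictly indifferent with respect
to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ iff $P(A_i) \neq 0$ for
all $i$, $1 \leq i \leq n$, and there is a homomorphism

$$P' : \mathcal{F}_{\mathcal{R}} \to (\mathbb{R}^+, \cdot)$$

such that

$$P' \circ \sigma_{\mathcal{R}} = P \qquad (3.26)$$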

Weak indifference with respect to $\mathcal{R}$ means strict indifference with
respect to $\mathcal{R} \cup \{(\top|\top)\}$; so we obtain immediately

Corollary 3.6.2. A probability function P is (weakly) indifferent with respect
to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ iff $P(A_i) \neq 0$ for
all $i$, $1 \leq i \leq n$, and there are real numbers
$\alpha_0, \alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^- \in \mathbb{R}^+$,
$\alpha_0 > 0$, such that

$$P(\omega) = \alpha_0 \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i B_i}} \alpha_i^+
                       \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i \overline{B_i}}} \alpha_i^- \qquad (3.27)$$

for all $\omega \in \Omega$.

Each probability function P has to obey the normalization constraint
$P(\top) = 1$ which may veil the (inter)actions of conditionals. So weak
indifference appears to be particularly appropriate to study conditional
structures in a probabilistic framework.
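
The product form (3.27) doubles as a recipe for constructing weakly indifferent
probability functions. The sketch below does this for
$\mathcal{R} = \{(c|a), (c|b)\}$ from Example 3.5.2. The numerical factors in
`ALPHA` are hypothetical values chosen only for illustration, and $\alpha_0$ is
simply taken to be the normalizing constant.

```python
from itertools import product

ATOMS = ("a", "b", "c")
# R = {(c|a), (c|b)}, each conditional written as (consequent, antecedent).
R = [("c", "a"), ("c", "b")]
# Hypothetical factors (alpha_i^+, alpha_i^-) for the two conditionals.
ALPHA = [(2.0, 1.0), (3.0, 0.5)]

def unnormalised(world):
    """Product of the factors of all conditionals applicable to the world."""
    value = 1.0
    for (consequent, antecedent), (plus, minus) in zip(R, ALPHA):
        if world[antecedent]:
            value *= plus if world[consequent] else minus
    return value

worlds = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=3)]
alpha_0 = 1.0 / sum(unnormalised(w) for w in worlds)   # normalisation constant

def P(a, b, c):
    """Probability of the world with the given truth values, following (3.27)."""
    return alpha_0 * unnormalised({"a": a, "b": b, "c": c})

# Worlds with the same conditional structure receive the same probability ...
print(P(False, False, True), P(False, False, False))
# ... and balanced quotients of worlds are mapped to 1:
print(P(True, True, True) * P(False, False, True),
      P(True, False, True) * P(False, True, True))
```

The last line compares $P(abc)\,P(\overline{a}\,\overline{b}c)$ with
$P(a\overline{b}c)\,P(\overline{a}bc)$; the corresponding quotient of worlds has a
balanced conditional structure with respect to $\mathcal{R}$ and exponent sum 0.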

Similar statements and arguments also hold for ordinal conditional functions
and OCF-conditionals. We start with reformulating the central Theorem
3.6.1 for the OCF-environment:

Theorem 3.6.2. An ordinal conditional function $\kappa$ is strictly indifferent
with respect to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ of
conditionals iff $\kappa(A_i) \neq \infty$ for all $i$, $1 \leq i \leq n$, and
there are rational numbers $\kappa_i^+, \kappa_i^- \in \mathbb{Q}$,
$1 \leq i \leq n$, such that

$$\kappa(\omega) = \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i B_i}} \kappa_i^+
                 + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i \overline{B_i}}} \kappa_i^- \qquad (3.28)$$

for all $\omega \in \Omega$.

This theorem and the statements to follow are proven in full analogy to
Theorem 3.6.1 and the corollaries above; so we omit the proofs.

The numbers $\kappa_i^+, \kappa_i^- \in \mathbb{Q}$, $1 \leq i \leq n$, again may
serve to define a suitable homomorphism of groups:

Corollary 3.6.3. An ordinal conditional function $\kappa$ is strictly indifferent
with respect to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ of
conditionals iff $\kappa(A_i) \neq \infty$ for all $i$, $1 \leq i \leq n$, and
there is a homomorphism

$$\kappa' : \mathcal{F}_{\mathcal{R}} \to (\mathbb{Q}, +)$$

such that

$$\kappa' \circ \sigma_{\mathcal{R}} = \kappa \qquad (3.29)$$

For weak indifference, we obtain

Corollary 3.6.4. An ordinal conditional function $\kappa$ is (weakly) indifferent
with respect to a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ of
conditionals iff $\kappa(A_i) \neq \infty$ for all $i$, $1 \leq i \leq n$, and
there are rational numbers $\kappa_0, \kappa_i^+, \kappa_i^- \in \mathbb{Q}$,
$1 \leq i \leq n$, such that

$$\kappa(\omega) = \kappa_0
  + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i B_i}} \kappa_i^+
  + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i \overline{B_i}}} \kappa_i^- \qquad (3.30)$$

for all $\omega \in \Omega$.

Theorems 3.6.1 and 3.6.2, as well as Corollaries 3.6.2 and 3.6.4, give simple
criteria to check conditional indifference with probability functions and
ordinal conditional functions. Moreover, they provide intelligible schemas to
construct conditionally indifferent functions.
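
As an illustration of such a schema, the following sketch builds a ranking
function of the additive form (3.30) for $\mathcal{R} = \{(c|a), (c|b)\}$. The
parameters in `KAPPA` are hypothetical values chosen so that the resulting ranks
are non-negative integers; $\kappa_0$ is chosen such that the minimal rank is 0.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b", "c")
# R = {(c|a), (c|b)}, each conditional written as (consequent, antecedent).
R = [("c", "a"), ("c", "b")]
# Hypothetical rational parameters (kappa_i^+, kappa_i^-).
KAPPA = [(Fraction(0), Fraction(1)), (Fraction(-1), Fraction(2))]

def raw_rank(world):
    """Sum of the parameters of all conditionals applicable to the world."""
    total = Fraction(0)
    for (consequent, antecedent), (plus, minus) in zip(R, KAPPA):
        if world[antecedent]:
            total += plus if world[consequent] else minus
    return total

worlds = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=3)]
kappa_0 = -min(raw_rank(w) for w in worlds)   # normalisation: minimal rank is 0

def kappa(a, b, c):
    """Rank of the world with the given truth values, following (3.30)."""
    return kappa_0 + raw_rank({"a": a, "b": b, "c": c})

def rank_of(formula):
    """Rank of a formula = minimal rank of its models."""
    return min(kappa(**w) for w in worlds if formula(w))

for w in worlds:
    print(w, kappa(**w))

# kappa accepts (c|a) iff the rank of ac is strictly smaller than that of a~c.
print(rank_of(lambda w: w["a"] and w["c"]) < rank_of(lambda w: w["a"] and not w["c"]))
```

In analogy to the probabilistic case, products are replaced by sums, so a
balanced quotient such as
$\frac{abc \cdot \overline{a}\,\overline{b}c}{a\overline{b}c \cdot \overline{a}bc}$
corresponds to the equality
$\kappa(abc) + \kappa(\overline{a}\,\overline{b}c) = \kappa(a\overline{b}c) + \kappa(\overline{a}bc)$.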

Note that the concept of conditional indifference is a structural notion,
using the numerical values of a conditional valuation function, V, as
manifestations of conditional structures imposed by a set $\mathcal{R}$. We do not
postulate, however, that the conditionals in $\mathcal{R}$ are satisfied by V.
This adoption of $\mathcal{R}$ will be dealt with in the following chapter,
Revising epistemic states by conditional beliefs.




4. Revising Epistemic States by Conditional 
Beliefs 



Usually, the belief sets in AGM theory (cf. Section 2.2) are assumed to be
deductively closed sets of propositional formulas, or to be represented by one
single propositional formula, respectively, and the revising beliefs are taken
to be propositional formulas. So the AGM postulates constrain revisions of
the form $\psi * A$, the revision operator $*$ connecting two propositional
formulas $\psi$ and $A$, where $\psi$ represents the initial state of belief and
$A$ stands for the new information. A representation theorem (see [KM91a])
establishes a relationship between AGM revision operators and total pre-orders
$\leq_\psi$ on the set of possible worlds, proving the revised belief set
$\psi * A$ to be satisfied precisely by all minimal A-worlds (see also Section 2.2).

Belief sets represent what is known for certain and are of specific inte- 
rest. They are, however, only poor reflections of the complex attitudes an 
individual may hold. The limitation to propositional beliefs severely restricts 
the frame of AGM theory, in particular, when iterated revisions have to be 
performed. So belief revision should not only be concerned with the revision 
of propositional beliefs but also with the modification of revision strategies 
when new information arrives (cf. [DP97a, Bou93, BG93]). These revision 
strategies may be given implicitly by some kind of preference relation like a 
plausibility ordering or an epistemic entrenchment (cf. Section 2.4), or may 
be taken explicitly as conditional beliefs. Revisions of the complex structure 
of an epistemic state so as to allow iterated revisions are denoted as transmu- 
tations of knowledge systems in [Wil94]. As a counterpart to the paradigm 
of minimal propositional change guiding the AGM postulates, the new pa- 
radigm of preserving conditional beliefs, shortly referred to as conditional 
preservation, arises in the framework of revising epistemic states. 

Darwiche and Pearl [DP97a] explicitly took conditional beliefs into ac- 
count by revising epistemic states instead of belief sets, and they advanced 
four postulates in addition to the AGM axioms as an approach to describe 
conditional preservation under revision by propositional beliefs (cf. the DP- 
postulates on page 22 in Section 2.4). 

In the sequel, we broaden the framework for revising epistemic states (as
presented, for instance, in [DP97a, Bou94, Wil94]) so as to include also the
revision by conditional beliefs. Thus belief revision is considered here in quite
a general framework, exceeding the AGM-theory in two respects:

— We revise epistemic states; this makes it necessary to allow for the changes 
in conditional beliefs caused by new information. 

— The new belief A may be of a conditional nature, thus reflecting a changed 
or newly acquired revision policy that has to be incorporated adequately. 

First, we present a scheme of eight postulates appropriate to guide the revi- 
sion of epistemic states by conditional beliefs (cf. Section 4.1). These postu- 
lates are supported mainly by following the specific, non-classical nature of 
conditionals. The aim of preserving conditional beliefs is achieved by studying 
specific interactions between conditionals, represented properly by two relati- 
ons. Because one of the postulates claims propositional belief revision to be a 
special case of conditional belief revision, our framework also covers the topic 
of Darwiche and Pearl’s work [DP97a], and we show that all four postulates 
presented there may be derived from our postulates. We state representation 
theorems for the principal postulates, and to exemplify our ideas, we present 
a conditional belief operator obeying all of the postulates by using ordinal 
conditional functions as representations of epistemic states. 

Like the postulates of Darwiche and Pearl, our postulates aim at de- 
scribing an appropriate principle of conditional preservation to be obeyed 
when revising epistemic states. The main result of this chapter, however, is 
a complete formalization of this principle not only with respect to one re- 
vising conditional, but to a (finite) set of conditionals to be simultaneously 
incorporated into the knowledge system (see Section 4.5). We base this princi- 
ple on representations of epistemic states via conditional valuation functions 
(cf. Section 3.2) by making use of the notion of conditional indifference (see 
Section 3.6). As a particular application of these ideas, we deal with the re- 
presentation of incompletely specified (conditional and factual) knowledge by 
epistemic states in an appropriate way. 



4.1 Postulates for Revising by Conditional Beliefs 

Revising an epistemic state $\Psi$ by a conditional $(B|A)$ becomes necessary if a
new conditional belief, or a new revision policy, respectively, is to be included
in $\Psi$, yielding a changed epistemic state $\Psi' = \Psi * (B|A)$ such that
$\Psi' \models (B|A)$, i.e. $\Psi' * A \models B$. We will use the same operator
$*$ for propositional as well as for conditional revision, thus expressing that
conditional revision should extend propositional revision in accordance with the
Ramsey test (RT) (see page 18).

In this section, we propose a catalogue of postulates a revision of an epistemic
state by a conditional should satisfy. The rationale behind our postulates
is not to minimize conditional change, as in Boutilier and Goldszmidt’s work
[BG93], but to preserve the conditional structure of the knowledge, as far as
possible. Here the key idea is to follow the conditionals in $\Psi$ as long as
there is no conflict between them and the new conditional belief. We will make
use of the relations $\sqsubseteq$ (subconditional) and $\perp\!\!\!\perp$
(perpendicularity), introduced in Section 3.4, Definitions 3.4.1 and 3.4.3, to
relate conditionals appropriately.

Postulates for conditional revision:

Suppose $\Psi$ is an epistemic state and $(B|A)$, $(D|C)$ are conditionals. Let
$\Psi * (B|A)$ denote the result of revising $\Psi$ by $(B|A)$.

(CR0) $\Psi * (B|A)$ is an epistemic state.

(CR1) $\Psi * (B|A) \models (B|A)$ (success).

(CR2) $\Psi * (B|A) = \Psi$ iff $\Psi \models (B|A)$ (stability).

(CR3) $\Psi * B := \Psi * (B|\top)$ induces a propositional AGM-revision operator.

(CR4) $\Psi * (B|A) = \Psi * (D|C)$ whenever $(B|A) \equiv (D|C)$.

(CR5) If $(D|C) \perp\!\!\!\perp (B|A)$ then $\Psi \models (D|C)$ iff $\Psi * (B|A) \models (D|C)$.

(CR6) If $(D|C) \sqsubseteq (B|A)$ and $\Psi \models (D|C)$ then $\Psi * (B|A) \models (D|C)$.

(CR7) If $(D|C) \sqsubseteq (\overline{B}|A)$ and $\Psi * (B|A) \models (D|C)$ then $\Psi \models (D|C)$.

Postulates (CR0) and (CR1) are self-evident. (CR2) postulates that $\Psi$ should
be left unchanged precisely if it already entails the conditional. (CR3) says
that the induced propositional revision operator should be in accordance with
the AGM postulates. (CR4) requires the result of the revision process to be
independent of the syntactical representation of conditionals.

The next three postulates aim at preserving the conditional structure of
knowledge:

(CR5) claims that revising by a conditional should preserve all conditionals
to which that conditional is irrelevant, in the sense described by the
relation $\perp\!\!\!\perp$. The rationale behind this postulate is the following:
The validity of a conditional $(B|A)$ in an epistemic state $\Psi$ depends on the
relation between (some) worlds in $Mod(AB)$ and (some) worlds in
$Mod(A\overline{B})$ (see Lemmata 2.4.1, 2.4.2). So incorporating $(B|A)$ into
$\Psi$ may require a shift between $Mod(AB)$ on one side and $Mod(A\overline{B})$
on the other side, but should leave intact any relations between worlds within
$Mod(AB)$, $Mod(A\overline{B})$, or $Mod(\overline{A})$. These relations may be
captured by conditionals $(D|C)$ not affected by $(B|A)$, that is, by conditionals
$(D|C) \perp\!\!\!\perp (B|A)$.

(CR6) states that conditional revision should bring about no change for
conditionals that are already in line with the revising conditional, and (CR7)
guarantees that no conditional change contrary to the revising conditional is
caused by conditional revision.

An idea of conditional preservation is also inherent to the postulates (C1)-(C4)
of Darwiche and Pearl ([DP97a], see also page 22) which we will show
to be covered by our postulates.

Theorem 4.1.1. Suppose * is a conditional revision operator obeying the 
postulates (CR0)-(CR7). Then for the induced propositional revision operator, 
postulates (C1)-(C4) are satisfied, too. 

This theorem provides further justifications for the postulates of Darwiche 
and Pearl from within the framework of conditionals. 



4.2 Representation Theorems 

To formulate and prove the representation theorems of this section, we as- 
sume that each epistemic state is equipped with a plausibility pre-ordering 
underlying propositional revision. Thus we actually presuppose Postulate 
(CR3), in observing Theorem 2.4.1, page 21. 

Postulates (CR5)-(CR7) claim specific connections to hold between $\Psi$ and
the revised $\Psi * (B|A)$, thus relating $\leq_{\Psi}$ and
$\leq_{\Psi * (B|A)}$. We will elaborate this relationship in order to
characterize those postulates by properties of the pre-orders associated with
$\Psi$ and $\Psi * (B|A)$.

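Postulate (CR5) proves to be of particular importance because it guarantees
the ordering within $\mathrm{Mod}(AB)$, $\mathrm{Mod}(A\overline{B})$,
$\mathrm{Mod}(\overline{A})$, respectively, to be preserved:

Theorem 4.2.1. The conditional revision operator $*$ satisfies (CR5) iff for
each epistemic state $\Psi$ and for each conditional $(B|A)$ it holds that:

$$\omega \leq_{\Psi} \omega' \quad\text{iff}\quad \omega \leq_{\Psi * (B|A)} \omega' \qquad (4.1)$$

for all worlds $\omega, \omega'$ both being elements of $\mathrm{Mod}(AB)$ (or
of $\mathrm{Mod}(A\overline{B})$, or of $\mathrm{Mod}(\overline{A})$,
respectively).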

As an immediate consequence, equation (4.1) yields 

Lemma 4.2.1. Suppose (4.1) holds for all worlds $\omega, \omega'$ both being
elements of $\mathrm{Mod}(AB)$ (or of $\mathrm{Mod}(A\overline{B})$, or of
$\mathrm{Mod}(\overline{A})$, respectively). Let $E \in \mathcal{L}$ be a
proposition such that either $E \leq AB$, or $E \leq A\overline{B}$, or
$E \leq \overline{A}$. Then

$$\min(E; \Psi) = \min(E; \Psi * (B|A)).$$

Together with the Ramsey test (RT) (see page 18), (CR5) yields equalities
of belief sets as are stated in the following proposition:

Proposition 4.2.1. If the conditional revision operator $*$ satisfies
Postulate (CR5), then

$$\mathrm{Bel}((\Psi * (B|A)) * AB) = \mathrm{Bel}(\Psi * AB)$$
$$\mathrm{Bel}((\Psi * (B|A)) * A\overline{B}) = \mathrm{Bel}(\Psi * A\overline{B})$$
$$\mathrm{Bel}((\Psi * (B|A)) * \overline{A}) = \mathrm{Bel}(\Psi * \overline{A})$$

For the representation theorems of Postulates (CR6) and (CR7), we need
Postulate (CR5), respectively equation (4.1) and its consequence, Lemma 4.2.1:
We have to ensure that the property of being a minimal world in the
affirmative or in the contradictory set associated with some conditionals is
not touched under revision.

Theorem 4.2.2. Suppose $*$ is a conditional revision operator satisfying
(CR5). Let $\Psi$ be an epistemic state, and let $(B|A)$ be a conditional.

1. $*$ satisfies (CR6) iff for all $\omega \in \mathrm{Mod}(AB)$,
   $\omega' \in \mathrm{Mod}(A\overline{B})$, $\omega \leq_{\Psi} \omega'$
   implies $\omega \leq_{\Psi * (B|A)} \omega'$.

2. $*$ satisfies (CR7) iff for all $\omega \in \mathrm{Mod}(AB)$,
   $\omega' \in \mathrm{Mod}(A\overline{B})$, $\omega' <_{\Psi * (B|A)} \omega$
   implies $\omega' <_{\Psi} \omega$.



4.3 Conditional Valuation Functions and Revision 

The theory of revising epistemic states is mainly devised and developed for 
qualitative representations, such as ordinal conditional functions and the like 
(see, for instance, [Bou94, DP97a]). The theorems of the previous section 
are only to be used in such a qualitative framework. Furthermore, revision, 
propositional beliefs and conditional beliefs may be tightly connected via the 
Ramsey test (see Section 2.4, equation (2.10), page 18). 

Conditional valuation functions $V : \mathcal{L} \to \mathcal{A}$ were
introduced in Section 3.2 as abstract representations of epistemic states,
with $V$ providing a measure of plausibility adequate to model belief
revisions (see Section 3.3). Conditional valuation functions subsume in
particular probability functions as well as ordinal conditional functions.
Therefore, they allow us to consider revisions in a quantitative as well as in
a qualitative framework. It is worthwhile studying how conditional valuation
functions can be revised in accordance with the Ramsey test within both
frameworks.

An ordinal conditional function
$\kappa : \Omega \to \mathbb{N} \cup \{0, \infty\}$ induces a (propositional)
AGM-revision operator $*$ by setting

$$\mathrm{Mod}(\kappa * A) = \min_{\kappa}(\mathrm{Mod}(A)) \qquad (4.2)$$

(see Theorem 2.2.2, page 16). The Ramsey test then reads

$$\kappa \models (B|A) \quad\text{iff}\quad \kappa * A \models B. \qquad (4.3)$$

This is in accordance with the plausibility relation imposed by $\kappa$, as
the following lemma shows:

Lemma 4.3.1. Let $(B|A)$ be a conditional in $(\mathcal{L} \mid \mathcal{L})$,
let $\kappa$ be an ordinal conditional function. The following three
statements are equivalent:

(i) $\kappa \models (B|A)$.

(ii) $\kappa(AB) < \kappa(A\overline{B})$.

(iii) $\kappa(\overline{B}|A) > 0$.

So $\kappa$ accepts a conditional (via the Ramsey test) iff $AB$ is more
plausible than $A\overline{B}$. The proof of this lemma is immediate by using
Lemma 2.4.2, page 22.
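The equivalence of (ii) and (iii) can be checked mechanically on a small
ranking. The following is a minimal sketch, assuming that worlds are encoded
as tuples of truth values and propositions as Python predicates; the toy
ranking `kappa` and all function names are illustrative assumptions, not taken
from the text.

```python
INF = float("inf")

def rank(kappa, prop):
    """kappa(A): the minimal rank among the worlds satisfying prop."""
    return min((r for w, r in kappa.items() if prop(w)), default=INF)

def accepts(kappa, a, b):
    """Statement (ii): kappa(AB) < kappa(A and not B)."""
    k_ab = rank(kappa, lambda w: a(w) and b(w))
    k_anb = rank(kappa, lambda w: a(w) and not b(w))
    return k_ab < k_anb

def conditional_rank(kappa, b, a):
    """kappa(B|A) = kappa(AB) - kappa(A); assumes kappa(A) is finite."""
    return rank(kappa, lambda w: a(w) and b(w)) - rank(kappa, a)

# Hypothetical toy OCF; a world is a pair (bird, flies).
kappa = {(True, True): 0, (True, False): 1, (False, True): 0, (False, False): 0}
bird = lambda w: w[0]
flies = lambda w: w[1]
not_bird = lambda w: not w[0]
not_flies = lambda w: not w[1]

fly_if_bird = accepts(kappa, bird, flies)   # conditional (flies | bird)
bird_if_fly = accepts(kappa, flies, bird)   # conditional (bird | flies)
```

In this toy ranking only the first conditional is accepted, and in both cases
the test via statement (iii), `conditional_rank(kappa, not_b, a) > 0`, gives
the same verdict as `accepts`.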

In general, we will say that a conditional valuation function $V$
qualitatively accepts a conditional, written as $V \models (B|A)$, iff
$V(A\overline{B}) < V(AB)$ is satisfied¹. Making this compatible with the
Ramsey test means to postulate

$$(V * A)(\overline{B}) < (V * A)(B) \quad\text{iff}\quad V(A\overline{B}) < V(AB).$$

This suggests that an adequate revision operator should fulfill

$$(V * A)(B) = \alpha \odot V(AB)$$

for some $\alpha \in \mathcal{A}$, $\alpha \neq 0^{\mathcal{A}}$, and for all
$B \in \mathcal{L}$ with $AB \neq \bot$. Observing that
$(V * A)(\top) = 1^{\mathcal{A}}$, by definition of a conditional valuation
function, this would imply $\alpha = V(A)^{-1}$, i.e.

$$(V * A)(B) = V(B|A) \qquad (4.4)$$

for all $B \in \mathcal{L}$ with $AB \neq \bot$. So, following a qualitative
approach and using the Ramsey test, an important quantitative idea arises of
how to realize the (propositional) revision of conditional valuation functions
appropriately. Note that, in spite of the obvious similarity of (4.4) with
Bayesian conditionalization (see, for instance, (4.7) below), this condition
does not force $(V * A)(B) = 0^{\mathcal{A}}$ for $AB = \bot$, because (4.4)
is only assumed to apply for $B \in \mathcal{L}$ with $AB \neq \bot$.

¹ Note that the degrees of plausibility are measured differently for OCFs and
for general conditional valuation functions. Those of conditional valuation
functions follow the intuitive ordering of real numbers (as is suitable for
possibility and probability distributions), while those of OCFs traditionally
reverse this ordering.



Things are more complicated for probability functions, P. Here the notion 
of belief may be dealt with in two different, but nevertheless intuitive, ways, 
one by setting 

$$P \models A \quad\text{iff}\quad P(A) = 1, \qquad (4.5)$$

the other one by setting

$$P \models A \quad\text{iff}\quad P(\overline{A}) < P(A) \qquad (4.6)$$

(see also Section 3.3). While the first approach seems to be most appropriate
for establishing propositional beliefs, the second one better fits the
intended meaning of conditional beliefs. Combining both methods with a
revision operator and applying the Ramsey test, respectively, gives rise to
two different criteria for accepting conditional beliefs. Although (4.6)
appears to be a proper candidate for realizing qualitative belief revision
within a probabilistic framework, the problem here is that, due to the
non-discreteness of real numbers, no generally accepted "best" revision
operator to establish $P(\overline{A}) < P(A)$ exists (though there is a vast
number of such operators; in Section 4.5, we will at least identify those
revision operators that are appropriate from a conditional logical point of
view).

On the contrary, criterion (4.5) gives rise to one of the oldest and most
popular revision operators, namely (Bayesian) conditionalization:

$$(P * A)(\omega) = P(\omega|A) = \begin{cases} \dfrac{P(\omega)}{P(A)} & \text{if } \omega \models A \\[1ex] 0 & \text{if } \omega \models \overline{A} \end{cases} \qquad (4.7)$$

But conditionalization is not a full revision operator in the AGM sense; it is
only an expansion (see Section 2.2, page 14) because it does not allow us to
establish beliefs which are contrary to the certain beliefs held in $P$.
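Conditionalization (4.7) is short enough to write out directly. The sketch
below assumes a finite set of worlds stored as dictionary keys and a
proposition given as a predicate; the example distribution is hypothetical.
The explicit error for $P(A) = 0$ makes the limitation just mentioned visible:
a proposition that $P$ excludes with certainty cannot be established this way.

```python
def conditionalize(P, a):
    """Return P * A as in (4.7): P(w)/P(A) on models of A, 0 elsewhere."""
    p_a = sum(p for w, p in P.items() if a(w))
    if p_a == 0:
        # A contradicts the certain beliefs of P, so (4.7) is undefined.
        raise ValueError("P(A) = 0: conditionalization is undefined")
    return {w: (p / p_a if a(w) else 0.0) for w, p in P.items()}

# Hypothetical distribution over worlds (a, b).
P = {(True, True): 0.2, (True, False): 0.3,
     (False, True): 0.5, (False, False): 0.0}

# Revising by the proposition "a" (first component true).
Q = conditionalize(P, lambda w: w[0])
```

After the call, `Q` assigns all probability mass to the two worlds satisfying
`a`, in the same proportion as before.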

So in a probabilistic framework, taking the quantitative degrees of pro- 
bability into account in a belief revision process actually seems to be more 
appropriate than a purely qualitative point of view. That is, instead of consi- 
dering belief revision as restricted to establishing beliefs for certain, we should 
be concerned with revisions of the form $P * A[x]$, yielding a probability
function $P^*$ such that $P^*(A) = x$. Moreover, a quantitative version of the
Ramsey test can be formalized as

$$P \models (B|A)[x] \quad\text{iff}\quad P * A[1] \models B[x]. \qquad (4.8)$$

By taking $*$ as conditionalization, we obtain

$$P \models (B|A)[x] \quad\text{iff}\quad P(B|A) = x$$

for the acceptance of probabilistic conditionals (cf. (3.8), page 34). Further
generalizations are possible, for instance

$$P \models (B[x]|A[y]) \quad\text{iff}\quad P * A[y] \models B[x], \qquad (4.9)$$

permitting us to deal with generalized conditionals. One method for achieving
revisions of the form $P * A[x]$ is Jeffrey's rule (see (2.12), page 24),
which coincides in this case with the more powerful principle of minimum
cross-entropy (see (2.13), page 25). The latter technique also allows us to
realize revisions by (sets of) probabilistic conditionals and thereby
constitutes an interesting and important example of revision operators of
epistemic states.

One of the aims of this chapter is to develop a formal principle of conditio- 
nal preservation which underlies both qualitative and quantitative revisions 
by conditional beliefs, in that it implies the qualitative axioms (CR5) - (CR7) 
and can also be applied in a purely quantitative framework. 

First, however, we will exemplify the eight postulates (CR0)-(CR7) of
Section 4.1 by presenting a revision operator for ordinal conditional
functions which satisfies all of these postulates.



4.4 A Revision Operator for Ordinal Conditional 
Functions 



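Propositional revisions of ordinal conditional functions
$\kappa : \Omega \to \mathbb{N} \cup \{0, \infty\}$ have been investigated by
several authors. For instance, Spohn [Spo88] proposed the following revision
operator $*_S$ to establish belief in $A$ with firmness $m$:

$$(\kappa *_S A[m])(\omega) = \begin{cases} \kappa(\omega) - \kappa(A) & \text{if } \omega \models A, \\ \kappa(\omega) - \kappa(\overline{A}) + m & \text{if } \omega \models \overline{A}. \end{cases} \qquad (4.10)$$

Spohn calls $\kappa *_S A[m]$ the $(A, m)$-conditionalization of $\kappa$.

Darwiche and Pearl [DP97a] modified this approach to define a revision
operator $*_{\bullet}$ which always strengthens the belief in $A$:

$$(\kappa *_{\bullet} A)(\omega) = \begin{cases} \kappa(\omega) - \kappa(A) & \text{if } \omega \models A, \\ \kappa(\omega) + 1 & \text{if } \omega \models \overline{A}. \end{cases} \qquad (4.11)$$

That is, $\kappa *_{\bullet} A = \kappa *_S A[\kappa(\overline{A}) + 1]$.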
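The relation between the two operators can be verified on a small ranking. The
sketch below reads (4.10) as shifting the $A$-worlds down by $\kappa(A)$ and
the $\overline{A}$-worlds to $\kappa(\omega) - \kappa(\overline{A}) + m$, and
(4.11) as shifting the $\overline{A}$-worlds up by 1; worlds are tuples,
propositions are predicates, and the ranking `kappa` is a hypothetical
example. Both $\kappa(A)$ and $\kappa(\overline{A})$ are assumed to be finite.

```python
INF = float("inf")

def rank(kappa, prop):
    """kappa(A): the minimal rank among the worlds satisfying prop."""
    return min((r for w, r in kappa.items() if prop(w)), default=INF)

def spohn(kappa, a, m):
    """(A, m)-conditionalization as read from (4.10)."""
    k_a = rank(kappa, a)
    k_not_a = rank(kappa, lambda w: not a(w))
    return {w: (r - k_a if a(w) else r - k_not_a + m)
            for w, r in kappa.items()}

def darwiche_pearl(kappa, a):
    """Revision as read from (4.11): not-A worlds are shifted up by 1."""
    k_a = rank(kappa, a)
    return {w: (r - k_a if a(w) else r + 1) for w, r in kappa.items()}

# Hypothetical OCF over worlds (x, y); we revise by the proposition x.
kappa = {(True, True): 2, (True, False): 1,
         (False, True): 0, (False, False): 3}
is_x = lambda w: w[0]
not_x = lambda w: not w[0]

firm_2 = spohn(kappa, is_x, 2)
dp = darwiche_pearl(kappa, is_x)
```

Here `firm_2` makes the most plausible `x`-world rank 0 and the most plausible
world violating `x` rank 2, which is the firmness asked for, and `dp`
coincides with `spohn(kappa, is_x, rank(kappa, not_x) + 1)`.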

Goldszmidt and Pearl [GP96] considered two different types of propositional
revision, namely conditionalization of Type-J and conditionalization of
Type-L. Their conditionalization of Type-J corresponds exactly to Spohn's
$(A, m)$-conditionalization.

In the following, we will define a conditional revision operator $*_0$ for
ordinal conditional functions that satisfies all of the postulates
(CR0)-(CR7). In particular, $*_0$ realizes the idea of conditional
preservation developed so far:

For an ordinal conditional function
$\kappa : \Omega \to \mathbb{N} \cup \{0, \infty\}$ and a conditional $(B|A)$,
we define $\kappa *_0 (B|A)$ by setting

$$\kappa *_0 (B|A)\,(\omega) = \begin{cases} \kappa(\omega) - \kappa(B|A) & \text{if } \omega \models AB \\ \kappa(\omega) + \alpha + 1 & \text{if } \omega \models A\overline{B} \\ \kappa(\omega) & \text{if } \omega \models \overline{A} \end{cases} \qquad (4.12)$$

where

$$\alpha = \begin{cases} -1, & \text{if } \kappa(AB) < \kappa(A\overline{B}), \\ 0, & \text{else.} \end{cases}$$

This revision operator $*_0$ generalizes the propositional revision operator
$*_{\bullet}$ (see (4.11) above), except for one crucial difference:
$*_{\bullet}$ always strengthens the belief in the revising proposition $A$,
even if $A$ is already believed with firmness $> 0$. So Darwiche and Pearl's
revision operator violates Postulate (CR2), while $*_0$ complies with it: If
already $\kappa \models (B|A)$, then $\kappa *_0 (B|A) = \kappa$, due to
$\alpha = -1$ in this case.

The check of the postulates (CR0)-(CR7) is straightforward, due to the
representation Theorems 4.2.1 and 4.2.2. So we have

Proposition 4.4.1. The conditional revision operator $*_0$ defined by (4.12)
satisfies all of the postulates (CR0)-(CR7).

Example 4.4.1 (Diagnosis). A physician has to make a diagnosis. The patient he
is facing obviously feels ill, and at first glance, the physician supposes
that the patient is suffering from disease $D$ causing mainly two symptoms,
$S_1$ (major symptom) and $S_2$ (minor symptom). To obtain certainty about
these symptoms, further examinations will be necessary over the next days. The
following table shows an ordinal conditional function $\kappa$ representing
the epistemic state of the physician under these conditions (0 marks a negated
variable, 1 an affirmed one):

    D   S_1  S_2 |  κ          D   S_1  S_2 |  κ
    0    0    0  |  5          1    0    0  |  4
    0    0    1  |  2          1    0    1  |  2
    0    1    0  |  1          1    1    0  |  0
    0    1    1  |  4          1    1    1  |  3



So in this epistemic state, $D S_1 \overline{S_2}$ is considered to be the
most plausible world, i.e. $\mathrm{Bel}(\kappa) \models D, S_1,
\overline{S_2}$. Moreover, we have $\kappa \models (S_1|D), (D|S_1),
(\overline{S_2}|D)$, as well as $\kappa \models (S_1|\overline{D})$.

In the evening, the physician finds an article in a medical journal, pointing
out that symptom $S_2$ has recently proved to be of major importance for
disease $D$. So the physician wants to revise $\kappa$ to incorporate the new
conditional belief $(S_2|D)$.

We make use of the revision operator $*_0$ introduced by (4.12) to calculate
a revised ordinal conditional function $\kappa *_0 (S_2|D)$. Here we have
$\kappa(DS_2) = 2$, $\kappa(D\overline{S_2}) = 0$, so
$\kappa \not\models (S_2|D)$ and $\kappa(S_2|D) = 2$. By (4.12), we obtain



    D   S_1  S_2 |  κ   |  κ *_0 (S_2|D)
    0    0    0  |  5   |  5
    0    0    1  |  2   |  2
    0    1    0  |  1   |  1
    0    1    1  |  4   |  4
    1    0    0  |  4   |  5
    1    0    1  |  2   |  0
    1    1    0  |  0   |  1
    1    1    1  |  3   |  1



Comparing $\kappa$ and $\kappa *_0 (S_2|D)$, we see that the physician still
believes in $D$, but that his belief in $S_1$ is given up in favor of now
believing $S_2$. Moreover, the conditional relationship between $D$ and $S_1$
is weakened: $(D|S_1)$ can no longer be entailed from $\kappa *_0 (S_2|D)$,
and $\kappa *_0 (S_2|D) \models (\overline{S_1}|D)$. The plausibility of
$DS_1$, however, is still quite high ($\kappa *_0 (S_2|D)(DS_1) = 1$).
Although a common occurrence of both symptoms $S_1$ and $S_2$ is still
excluded ($\kappa *_0 (S_2|D)(S_1S_2) \neq 0$), its plausibility is raised
from 3 to 1.
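The revised ranks of this example can be recomputed mechanically. The sketch
below reads (4.12) as follows: worlds in $AB$ are lowered by
$\kappa(B|A) = \kappa(AB) - \kappa(A)$, worlds in $A\overline{B}$ are raised
by $\alpha + 1$, and worlds in $\overline{A}$ keep their rank. Worlds are
encoded as triples `(D, S1, S2)` of 0/1 values; the function names are
illustrative, and $\kappa(A)$ is assumed to be finite.

```python
INF = float("inf")

def rank(kappa, prop):
    """kappa(A): the minimal rank among the worlds satisfying prop."""
    return min((r for w, r in kappa.items() if prop(w)), default=INF)

def revise(kappa, b, a):
    """kappa *_0 (B|A) as read from (4.12)."""
    k_ab = rank(kappa, lambda w: a(w) and b(w))
    k_anb = rank(kappa, lambda w: a(w) and not b(w))
    k_b_given_a = k_ab - min(k_ab, k_anb)   # kappa(B|A) = kappa(AB) - kappa(A)
    alpha = -1 if k_ab < k_anb else 0
    revised = {}
    for w, r in kappa.items():
        if not a(w):
            revised[w] = r                  # worlds in not-A are untouched
        elif b(w):
            revised[w] = r - k_b_given_a    # verifying worlds are shifted down
        else:
            revised[w] = r + alpha + 1      # falsifying worlds are shifted up
    return revised

# The physician's prior ranking; worlds are (D, S1, S2).
kappa = {(0, 0, 0): 5, (0, 0, 1): 2, (0, 1, 0): 1, (0, 1, 1): 4,
         (1, 0, 0): 4, (1, 0, 1): 2, (1, 1, 0): 0, (1, 1, 1): 3}
has_d = lambda w: w[0] == 1
has_s2 = lambda w: w[2] == 1

revised = revise(kappa, has_s2, has_d)      # kappa *_0 (S2|D)
again = revise(revised, has_s2, has_d)      # revising once more
```

Revising a second time by the same conditional leaves the ranking unchanged,
which is the behavior required by Postulate (CR2).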



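The operator $*_0$ defined by (4.12) can be extended to adopt conditionals
$(B|A)[m]$ quantified by a degree of firmness $m > 0$:

$$\kappa *_0 (B|A)[m]\,(\omega) = \begin{cases} \kappa(\omega) - \kappa(B|A) & \text{if } \omega \models AB \\ \kappa(\omega) + m - \kappa(\overline{B}|A) & \text{if } \omega \models A\overline{B} \\ \kappa(\omega) & \text{if } \omega \models \overline{A} \end{cases} \qquad (4.13)$$

It is easy to check that this extended operator $*_0$ also satisfies (a
quantitative version of) Postulate (CR2).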



4.5 The Principle of Conditional Preservation 

Minimality of change is a crucial paradigm for belief revision, and a
"principle of conditional preservation" is to realize this idea of minimality
when conditionals are involved in change. Minimizing absolutely the changes in
conditional beliefs, as in [BG93], is an important proposal to this aim, but
it does not always lead to intuitive results (cf. [DP97a]). The idea we will
develop here rather aims at preserving the conditional structure of knowledge
within an epistemic state which we assume to be represented by a conditional
valuation function $V : \mathcal{L} \to \mathcal{A}$ (cf. Section 3.2). The
propositions and theorems to be presented in this section extend the results
of [KI99a].

The notion of a conditional structure with respect to a set $\mathcal{R}$ of
conditionals was defined in Section 3.5, and in Section 3.6, we explained what
it means for $V$ to follow the structure imposed by $\mathcal{R}$ on the set
of worlds by introducing the notion of conditional indifference (cf.
Definition 3.6.1).

Pursuing this approach further in the framework of belief revision, a revision
of $V$ by simultaneously incorporating the conditionals in $\mathcal{R}$,
$V^* = V * \mathcal{R}$, can be said to preserve the conditional structure of
$V$ with respect to $\mathcal{R}$ if the relative change function
$V^* \odot V^{-1}$ is indifferent with respect to $\mathcal{R}$. Taking into
regard prior knowledge $V$ and the worlds $\omega$ with
$V(\omega) = 0^{\mathcal{A}}$ appropriately, this gives rise to the following
definitions:



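Definition 4.5.1. Let $V : \mathcal{L} \to \mathcal{A}$ be a conditional
valuation function, and let $\mathcal{R}$ be a finite set of (quantified)
conditionals. Let $V^* = V * \mathcal{R}$ denote the result of revising $V$ by
$\mathcal{R}$; in particular, suppose that $V^*(A) \neq 0^{\mathcal{A}}$ for
all $(B|A) \in \mathcal{R}$.

1. $V^*$ is called $V$-consistent iff $V(\omega) = 0^{\mathcal{A}}$ implies
   $V^*(\omega) = 0^{\mathcal{A}}$; $V^*$ is called strictly $V$-consistent
   iff $V(\omega) = 0^{\mathcal{A}} \Leftrightarrow V^*(\omega) = 0^{\mathcal{A}}$;

2. If $V^*$ is $V$-consistent, then the relative change function
   $(V^*/V) : \Omega \to \mathcal{A}$ is defined by

   $$(V^*/V)(\omega) = \begin{cases} V^*(\omega) \odot V(\omega)^{-1} & \text{if } V(\omega) \neq 0^{\mathcal{A}} \\ 0^{\mathcal{A}} & \text{if } V(\omega) = 0^{\mathcal{A}} \end{cases}$$

3. $V^*$ is strictly indifferent (indifferent) with respect to $\mathcal{R}$
   and $V$ iff $V^*$ is $V$-consistent and the following two conditions hold:

   (i) If $V^*(\omega) = 0^{\mathcal{A}}$ then $V(\omega) = 0^{\mathcal{A}}$,
       or there is $(B|A) \in \mathcal{R}$ such that
       $\sigma_{(B|A)}(\omega) \neq 1$ and $V^*(\omega') = 0^{\mathcal{A}}$
       for all $\omega'$ with
       $\sigma_{(B|A)}(\omega') = \sigma_{(B|A)}(\omega)$.

   (ii) $(V^*/V)(\hat{\omega}_1) = (V^*/V)(\hat{\omega}_2)$ whenever
        $\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)$
        (and $\hat{\omega}_1\hat{\Omega}_0 = \hat{\omega}_2\hat{\Omega}_0$)
        for $\hat{\omega}_1, \hat{\omega}_2 \in \hat{\Omega}^*_+$, where
        $\Omega^*_+ = \{\omega \in \Omega \mid V^*(\omega) \neq 0^{\mathcal{A}}\}$.

4. A revision $V^*$ is said to satisfy the (strict) principle of conditional
   preservation with respect to $\mathcal{R}$ and $V$ iff $V^*$ is (strictly)
   indifferent with respect to $\mathcal{R}$ and $V$.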

Thus in a numerical framework, the principle of conditional preservation
is realized as an indifference property.

Remark 4.5.1. Though the relative change function $(V^*/V)$ is not a
conditional valuation function, it may nevertheless be extended to a
homomorphism $(V^*/V) : \hat{\Omega}^*_+ \to (\mathcal{A}, \odot)$ (see
Section 3.6). Therefore, Definition 4.5.1(3) is an appropriate modification of
Definition 3.6.1 for revisions.

Note that the principle of conditional preservation is based only on observing
conditional structures, without using any acceptance conditions or taking
quantifications of conditionals into account.

To illustrate the postulate for conditional preservation once again in a
probabilistic setting, we will return to the Florida murderers example (see
Example 3.5.1 in Section 3.5, page 40).

Example 4.5.1 (Florida murderers, continued). The propositional variables
involved here are $V$ = Victim (of the murder) is black or white,
respectively, $\dot{v} \in \{v_b, v_w\}$, $M$ = Murderer is black or white,
respectively, $\dot{m} \in \{m_b, m_w\}$, and $D$ = Murderer is sentenced to
Death, $\dot{d} \in \{d, \overline{d}\}$. The following probability
distribution $P$ mirrors the sentencing policy in the US state of Florida
during a six-year period:

    ω                    P(ω)    |  ω                               P(ω)
    $v_w m_w d$          0.0151  |  $v_w m_w \overline{d}$          0.4353
    $v_w m_b d$          0.0101  |  $v_w m_b \overline{d}$          0.0502
    $v_b m_w d$          0       |  $v_b m_w \overline{d}$          0.0233
    $v_b m_b d$          0.0023  |  $v_b m_b \overline{d}$          0.4637

Assume that in a following year, we observe a slightly changed relationship
between $m_b$ and $d$, say $(d|m_b)[0.03]$ instead of $(d|m_b)[0.0236]$, and
we want $P$ to be adjusted to this new information. So we have
$\mathcal{R} = \{(d|m_b)[0.03]\}$, and let two symbols $a^+, a^-$ be
associated with $\mathcal{R}$. The conditional structures with respect to
$\mathcal{R}$ are calculated easily as follows:

$$\sigma_{\mathcal{R}}(v_w m_w d) = \sigma_{\mathcal{R}}(v_w m_w \overline{d}) = \sigma_{\mathcal{R}}(v_b m_w d) = \sigma_{\mathcal{R}}(v_b m_w \overline{d}) = 1,$$
$$\sigma_{\mathcal{R}}(v_w m_b d) = \sigma_{\mathcal{R}}(v_b m_b d) = a^+,$$
$$\sigma_{\mathcal{R}}(v_w m_b \overline{d}) = \sigma_{\mathcal{R}}(v_b m_b \overline{d}) = a^-.$$

Consider the elements $\hat{\omega}_1 = v_w m_b d \cdot v_b m_b \overline{d}$
and $\hat{\omega}_2 = v_b m_b d \cdot v_w m_b \overline{d}$ with equal
conditional structures
$\sigma_{\mathcal{R}}(\hat{\omega}_1) = a^+ a^- = \sigma_{\mathcal{R}}(\hat{\omega}_2)$.
Therefore for $P^*$ to be indifferent with respect to $\mathcal{R}$ and $P$,
it has to satisfy

$$\frac{P^*(v_w m_b d)\, P^*(v_b m_b \overline{d})}{P^*(v_b m_b d)\, P^*(v_w m_b \overline{d})} = \frac{P(v_w m_b d)\, P(v_b m_b \overline{d})}{P(v_b m_b d)\, P(v_w m_b \overline{d})},$$

which corresponds to equation (3.17), page 41.

Thus the concept of conditional structures helps us to get a technically
clear and precise formalization of the intuitive idea of conditional
preservation.
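The numbers of this example are easy to check. The sketch below computes
$P(d|m_b)$ from the table and then adjusts $P$ by rescaling all worlds with
$m_b d$ by one common factor and all worlds with $m_b \overline{d}$ by
another, leaving the $m_w$ worlds untouched. This particular adjustment is an
assumption chosen for illustration (it is one revision whose change depends
only on the conditional structure, not necessarily the one singled out later
in the text). Worlds are encoded as triples `(victim, murderer, death)`.

```python
import math

# Worlds are (victim, murderer, death sentence); "w" = white, "b" = black.
P = {
    ("w", "w", True): 0.0151, ("w", "w", False): 0.4353,
    ("w", "b", True): 0.0101, ("w", "b", False): 0.0502,
    ("b", "w", True): 0.0,    ("b", "w", False): 0.0233,
    ("b", "b", True): 0.0023, ("b", "b", False): 0.4637,
}

def cond_prob(Q):
    """Q(d | m_b)."""
    num = sum(p for (v, m, d), p in Q.items() if m == "b" and d)
    den = sum(p for (v, m, d), p in Q.items() if m == "b")
    return num / den

def adjust(Q, x):
    """Rescale verifying and falsifying worlds of (d|m_b) so that Q*(d|m_b) = x."""
    q = cond_prob(Q)
    f_plus, f_minus = x / q, (1 - x) / (1 - q)
    return {(v, m, d): p * (1.0 if m != "b" else f_plus if d else f_minus)
            for (v, m, d), p in Q.items()}

def cross_ratio(Q):
    """The quotient that conditional indifference requires to stay fixed."""
    return (Q[("w", "b", True)] * Q[("b", "b", False)]) / \
           (Q[("b", "b", True)] * Q[("w", "b", False)])

P_star = adjust(P, 0.03)
```

With this adjustment, `cond_prob(P_star)` equals 0.03 up to rounding, while
`cross_ratio(P_star)` agrees with `cross_ratio(P)`, because the two scaling
factors cancel in the quotient.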



Thus we have developed two formal approaches to realize the idea of
preserving conditional beliefs, one via the postulates (CR5)-(CR7) presented
in Section 4.1, the other by applying the concept of conditional indifference
appropriately in Definition 4.5.1(4). Though Proposition 3.5.1 reveals a
connection between conditional structures, as introduced in Section 3.5, and
the relations $\sqsubseteq$ and $\perp\!\!\!\perp$ used to formalize
(CR5)-(CR7), it still remains to make clear the compatibility of both
approaches. Theorem 4.5.1 below will show that, within a qualitative setting,
in the case that $\mathcal{R}$ consists of only one conditional $(B|A)$, any
strictly $V$-consistent revision $V * \mathcal{R}$ satisfying the principle of
conditional preservation with respect to $\mathcal{R}$ and $V$ also obeys the
postulates (CR5)-(CR7).

We begin by characterizing revisions
$V^* = V * \mathcal{R} = V * (B|A)$ satisfying the principle of conditional
preservation with respect to $\mathcal{R} = \{(B|A)\}$ and $V$. As a basic
requirement for such revisions, we will only presuppose that
$V^*(A) \neq 0^{\mathcal{A}}$, instead of the (stronger) success postulate
$V^* \models (B|A)$. This makes the results to be presented independent of
acceptance conditions and helps to concentrate on conditional structures; in
particular, it will be possible to make use of these results even when
conditionals are assigned numerical degrees of acceptance. Note that the
principle of conditional preservation with respect to $\mathcal{R}$ does not
imply the acceptance of $\mathcal{R}$ in general.



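Proposition 4.5.1. Let $V : \mathcal{L} \to \mathcal{A}$ be a conditional
valuation function, and let $\mathcal{R} = \{(B|A)\}$,
$(B|A) \in (\mathcal{L} \mid \mathcal{L})$, consist of only one conditional.
Let $V^* = V * \mathcal{R} = V * (B|A)$ denote a revision of $V$ by $(B|A)$
such that $V^*(A) \neq 0^{\mathcal{A}}$.

1. $V^*$ satisfies the strict principle of conditional preservation with
   respect to $V$ and $\mathcal{R}$ iff there are constants
   $\alpha^+, \alpha^- \in \mathcal{A}$ such that

   $$V^*(\omega) = \begin{cases} \alpha^+ \odot V(\omega) & \text{if } \omega \models AB \\ \alpha^- \odot V(\omega) & \text{if } \omega \models A\overline{B} \\ V(\omega) & \text{if } \omega \models \overline{A} \end{cases} \qquad (4.14)$$

2. $V^*$ satisfies the principle of conditional preservation with respect to
   $V$ and $\mathcal{R}$ iff there are constants
   $\alpha_0, \alpha^+, \alpha^- \in \mathcal{A}$ such that

   $$V^*(\omega) = \begin{cases} \alpha^+ \odot V(\omega) & \text{if } \omega \models AB \\ \alpha^- \odot V(\omega) & \text{if } \omega \models A\overline{B} \\ \alpha_0 \odot V(\omega) & \text{if } \omega \models \overline{A} \end{cases} \qquad (4.15)$$

3. If $V^*$ is strictly $V$-consistent, then all constants
   $\alpha_0, \alpha^+, \alpha^- \in \mathcal{A}$ in parts 1. and 2. may be
   chosen $\neq 0^{\mathcal{A}}$.

Before focussing on qualitative acceptance of conditionals (see Section 4.3),
we will formulate a quantitative counterpart of postulate (CR5) for
conditional valuation functions $V$: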

(CR5$^{quant}$) If $(D|C) \perp\!\!\!\perp (B|A)$ and
$V(CD), (V * (B|A))(CD) \neq 0^{\mathcal{A}}$, then
$V(C\overline{D}) \odot V(CD)^{-1} = (V * (B|A))(C\overline{D}) \odot (V * (B|A))(CD)^{-1}$.

(CR5$^{quant}$) ensures that essentially, the values assigned to conditionals
which are perpendicular to the revising conditional are not changed under
revision:

Lemma 4.5.1. Suppose the revision $V * (B|A)$ is strictly $V$-consistent and
satisfies (CR5$^{quant}$). Then for any conditional
$(D|C) \perp\!\!\!\perp (B|A)$ with $V(C) \neq 0^{\mathcal{A}}$ it holds that

$$V(D|C) = (V * (B|A))(D|C).$$

The following proposition shows that substantially, (CR5$^{quant}$) is
stronger than its qualitative counterpart (CR5):

Proposition 4.5.2. Let $V^* = V * \mathcal{R} = V * \{(B|A)\}$ denote a
strictly $V$-consistent revision of $V$ by $(B|A)$ such that
$V^*(A) \neq 0^{\mathcal{A}}$. If $V^*$ fulfills (CR5$^{quant}$), then it also
satisfies (CR5).

The following theorem identifies the principle of conditional preservation 
(or conditional indifference, respectively) as a fundamental device to guide 
reasonable changes in the conditional structure of knowledge: 

Theorem 4.5.1. Let $V : \mathcal{L} \to \mathcal{A}$ be a conditional
valuation function, and let $\mathcal{R} = \{(B|A)\}$,
$(B|A) \in (\mathcal{L} \mid \mathcal{L})$, consist of only one conditional.
Let $V^* = V * \mathcal{R}$ denote a strictly $V$-consistent revision of $V$
by $\mathcal{R}$ fulfilling the postulates (CR1) (success) and (CR2)
(stability).

If $V^*$ satisfies the principle of conditional preservation, then the
revision also satisfies postulate (CR5$^{quant}$) and the postulates (CR6) and
(CR7); in particular, it satisfies all of the postulates (CR5)-(CR7) (see
Section 4.1).



4.6 C-Revisions and C-Representations 

Essentially, the principle of conditional preservation means the strict
indifference of the relative change function $(V^*/V)$ with respect to the
revising set $\mathcal{R}$ of conditionals. In the preceding section, we
showed that in case that $\mathcal{R}$ consists of only one conditional, this
is compatible with the qualitative notion of conditional preservation
described by the postulates (CR5)-(CR7).

In this section, we will return to the more general framework of considering
the simultaneous revision by a set of (quantified) conditionals, and we will
take the success condition (CR1), $V^* \models \mathcal{R}$, more explicitly
into account. In particular, for probability functions and for ordinal
conditional functions, we will present practical schemes to obtain revisions
satisfying the principle of conditional preservation by making use of the
results in Section 3.6.

Definition 4.6.1. Let $V, V^* : \mathcal{L} \to \mathcal{A}$ be conditional
valuation functions, and let $\mathcal{R}$ be a set of (quantified)
conditionals. A conditional valuation function
$V^* : \mathcal{L} \to \mathcal{A}$ is called a (strict) c-revision of $V$ by
$\mathcal{R}$ iff $V^*$ satisfies the (strict) principle of conditional
preservation with respect to $V$ and $\mathcal{R}$, and
$V^* \models \mathcal{R}$.

A c-revision is based both on $V$ and $\mathcal{R}$, using $V$ as a reference
point and $\mathcal{R}$ as a guideline for changes. The prefix "c" marks the
conditional well-behavedness of such revisions. C-revisions will sometimes
also be called c-adaptations, in particular within a probabilistic framework
(cf. [KI98a]).

A special case arises if no prior knowledge to be revised is at hand, and the
set $\mathcal{R}$ of (quantified) conditionals constitutes the only knowledge
available. Then the problem is to find a conditional valuation function, i.e.
an epistemic state, that represents $\mathcal{R}$ most adequately. This
representation problem may be considered as a particular revision problem, by
taking the uniform conditional valuation function (see Definition 3.2.2, page
33), $V_0$, as an appropriate conditional valuation function to start revision
with. By assigning the same degree of probability, plausibility, etc. to each
world in $\Omega$, $V_0$ represents a state of complete ignorance in this
framework.

Definition 4.6.2. A conditional valuation function
$V^* : \mathcal{L} \to \mathcal{A}$ is called a (strict) c-representation of a
set $\mathcal{R}$ of (quantified) conditionals, iff $V^*$ satisfies the
(strict) principle of conditional preservation with respect to $V_0$ and
$\mathcal{R}$, and $V^* \models \mathcal{R}$.

The following proposition shows that the uniform distribution actually 
plays a neutral part for c-representations, having only a normalizing effect 
on the representing conditional valuation function. Furthermore, the compa- 
tibility of the concepts of conditional indifference, developed for revisions in 
Section 4.5 and for conditional valuation functions in Section 3.6, is stated 
(see, in particular. Definitions 3.6.1 and 4.5.1). 

Proposition 4.6.1. A conditional valuation function $V^*$ is indifferent with
respect to $\mathcal{R}$ and $V_0$ iff $V^*$ is indifferent with respect to
$\mathcal{R}$.

We will now transfer the notions and results of Section 3.6 into the
framework of revision (see pages 48 ff.).

Definition 4.6.3. Let $V, V^* : \mathcal{L} \to \mathcal{A}$ be conditional
valuation functions, let $\mathcal{R}$ be a set of (quantified) conditionals.

1. $V^*$ is called a faithful c-revision of $V$ by $\mathcal{R}$ iff $V^*$ is
   a c-revision of $V$ by $\mathcal{R}$ satisfying
   $\ker_0 \sigma_{\mathcal{R}} \cap \hat{\Omega}^*_+ = \ker_0 (V^*/V)$.

2. $V^*$ is called a faithful c-representation of $\mathcal{R}$ iff $V^*$ is a
   c-representation of $\mathcal{R}$ satisfying
   $\ker_0 \sigma_{\mathcal{R}} \cap \hat{\Omega}^*_+ = \ker_0 V^*$.

Due to their strict obeying of prior knowledge and new information, faithful
c-revisions and c-representations will prove to be especially useful in the
context of knowledge discovery and data mining (see Chapter 8).

The next theorem characterizes revisions of ordinal conditional functions 
that satisfy the principle of conditional preservation. The theorem is obvious 
by observing Corollary 3.6.4, page 51. 

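Theorem 4.6.1. Let $\kappa, \kappa^*$ be ordinal conditional functions, and
let $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ be a (finite) set of
conditionals in $(\mathcal{L} \mid \mathcal{L})$.

A revision $\kappa^* = \kappa * \mathcal{R}$ satisfies the principle of
conditional preservation iff $\kappa^*(A_i) \neq \infty$ for all
$i, 1 \leq i \leq n$, and there are numbers
$\kappa_0, \kappa_i^+, \kappa_i^- \in \mathbb{Q}$, $1 \leq i \leq n$, such
that

$$\kappa^*(\omega) = \kappa(\omega) + \kappa_0 + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_iB_i}} \kappa_i^+ + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i\overline{B_i}}} \kappa_i^- \qquad (4.16)$$

for all $\omega \in \Omega$.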

Combining Theorem 4.6.1 with Lemma 4.3.1, we obtain 

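Corollary 4.6.1. Let $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ be a
(finite) set of conditionals in $(\mathcal{L} \mid \mathcal{L})$, and let
$\kappa$ be an ordinal conditional function.

$\kappa$ is a c-representation of $\mathcal{R}$ iff
$\kappa(A_i) \neq \infty$ for all $i, 1 \leq i \leq n$, and there are numbers
$\kappa_0, \kappa_i^+, \kappa_i^- \in \mathbb{Q}$, $1 \leq i \leq n$, such
that

$$\kappa(\omega) = \kappa_0 + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_iB_i}} \kappa_i^+ + \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i\overline{B_i}}} \kappa_i^-, \quad \omega \in \Omega, \qquad (4.17)$$

and

$$\kappa_i^+ - \kappa_i^- < \min_{\omega \models A_i\overline{B_i}} \Bigl( \sum_{\substack{j \neq i \\ \omega \models A_jB_j}} \kappa_j^+ + \sum_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \kappa_j^- \Bigr) - \min_{\omega \models A_iB_i} \Bigl( \sum_{\substack{j \neq i \\ \omega \models A_jB_j}} \kappa_j^+ + \sum_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \kappa_j^- \Bigr).$$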



A well-known method to calculate an ordinal conditional function apt to
represent a (finite) set
$\mathcal{R} = \{r_i = (B_i|A_i) \mid 1 \leq i \leq n\}$ of conditionals is
the system-Z of Goldszmidt and Pearl ([GP92, GP96]). The basic idea of
system-Z is to observe the (logical) interactions of the conditionals in
$\mathcal{R}$ which are described by the notion of tolerance. A conditional
$(B|A)$ is tolerated by a set of conditionals $\mathcal{S}$ iff there is a
world $\omega$ such that $\omega$ confirms $(B|A)$ and $\omega$ does not
refute any of the conditionals in $\mathcal{S}$. If $\mathcal{R}$ is
consistent, then there is an ordered partition
$\mathcal{R}_0, \mathcal{R}_1, \ldots, \mathcal{R}_k$ of $\mathcal{R}$ such
that each conditional in $\mathcal{R}_m$ is tolerated by
$\bigcup_{j=m}^{k} \mathcal{R}_j$, $0 \leq m \leq k$ (for more details see,
for instance, [GP96]).

The system-Z ranking function, $\kappa^z$, representing $\mathcal{R}$ is given
by

$$\kappa^z(\omega) = \begin{cases} 0, & \text{if } \omega \text{ does not falsify any } r_i, \\ 1 + \max\limits_{\substack{1 \leq i \leq n \\ \omega \models A_i\overline{B_i}}} Z(r_i), & \text{otherwise} \end{cases} \qquad (4.18)$$

where $Z(r_i) = j$ iff $r_i \in \mathcal{R}_j$. $\kappa^z$ assigns to each
world $\omega$ the lowest possible rank admissible with respect to the
constraints in $\mathcal{R}$.

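Comparing (4.18) with (4.17), we see that in general, $\kappa^z$ is not a
c-representation of $\mathcal{R}$, since in its definition (4.18), maximum is
used instead of summation (see Example 4.6.1 below). The partition
$\mathcal{R}_0, \mathcal{R}_1, \ldots, \mathcal{R}_k$ of $\mathcal{R}$,
however, may well serve to define appropriate constants $\kappa_i^-$ in
(4.17). Setting $\kappa_0 := \kappa_i^+ := 0$, and

$$\kappa_i^- := \min_{\substack{\omega \models A_iB_i \\ \omega \not\models A_j\overline{B_j} \text{ for all } r_j \in \bigcup_{l=m}^{k} \mathcal{R}_l}} \Bigl( \sum_{\substack{r_j \in \bigcup_{l=0}^{m-1} \mathcal{R}_l \\ \omega \models A_j\overline{B_j}}} \kappa_j^- \Bigr) + 1 \qquad (4.19)$$

for each conditional $r_i \in \mathcal{R}_m$, $m = 0, \ldots, k$ successively,
we obtain a c-representation of $\mathcal{R}$ via

$$\kappa^c(\omega) := \sum_{\substack{1 \leq i \leq n \\ \omega \models A_i\overline{B_i}}} \kappa_i^- \qquad (4.20)$$

Example 4.6.1 (System Z). Consider the set $\mathcal{R}$ consisting of the
following conditionals: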



    $r_1$ :  $(f|b)$              Birds fly.
    $r_2$ :  $(b|p)$              Penguins are birds.
    $r_3$ :  $(\overline{f}|p)$   Penguins do not fly.
    $r_4$ :  $(w|b)$              Birds have wings.
    $r_5$ :  $(a|f)$              Animals that fly are airborne.

(see Example 17 in [GP96, p. 68f]). Here $\mathcal{R}$ is partitioned by
$\mathcal{R}_0 = \{r_1, r_4, r_5\}$ and $\mathcal{R}_1 = \{r_2, r_3\}$ (for
the details, see [GP96, p. 69]). By applying (4.19), we calculate
$\kappa_1^- = \kappa_4^- = \kappa_5^- = 1$ and
$\kappa_2^- = \kappa_3^- = 2$. Actually, in this example, $\kappa^c$ from
(4.20) coincides with the system-Z* ranking function (cf. [GMP93, BP99]). We
obtain, for example,

$$\kappa^c(\overline{p}\, b f w a) = 0,$$
$$\kappa^c(p\, b \overline{f} w a) = \kappa_1^- = 1,$$
$$\kappa^c(p\, \overline{b} f w \overline{a}) = \kappa_2^- + \kappa_3^- + \kappa_5^- = 5.$$

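In Table 4.1, we list the ranks of all possible worlds, first computed by
system-Z, according to (4.18), and then computed as a c-representation of
$\mathcal{R}$, according to (4.20). Comparing $\kappa^z(\omega)$ to
$\kappa^c(\omega)$, we see that $\kappa^c$ is more fine-grained. Table 4.1
also reveals that $\kappa^z$ is not a c-representation of $\mathcal{R}$:
Assigning the symbols $a_i^+, a_i^-$ to the conditionals $r_i$ in
$\mathcal{R}$, $1 \leq i \leq 5$, respectively, we obtain

$$\sigma_{\mathcal{R}}\left( \frac{p b f w a \cdot \overline{p} b f w \overline{a}}{p b f w \overline{a} \cdot \overline{p} b f w a} \right) = \frac{a_1^+ a_2^+ a_3^- a_4^+ a_5^+ \cdot a_1^+ a_4^+ a_5^-}{a_1^+ a_2^+ a_3^- a_4^+ a_5^- \cdot a_1^+ a_4^+ a_5^+} = 1,$$

but

$$\kappa^z\left( \frac{p b f w a \cdot \overline{p} b f w \overline{a}}{p b f w \overline{a} \cdot \overline{p} b f w a} \right) = \kappa^z(p b f w a) + \kappa^z(\overline{p} b f w \overline{a}) - \kappa^z(p b f w \overline{a}) - \kappa^z(\overline{p} b f w a) = 3 - 2 = 1 \neq 0.$$

Table 4.1. Rankings for Example 4.6.1

    ω                                                κ^z(ω)  κ^c(ω) |  ω                                                          κ^z(ω)  κ^c(ω)
    $p b f w a$                                        2       2    |  $\overline{p} b f w a$                                       0       0
    $p b f w \overline{a}$                             2       3    |  $\overline{p} b f w \overline{a}$                            1       1
    $p b f \overline{w} a$                             2       3    |  $\overline{p} b f \overline{w} a$                            1       1
    $p b f \overline{w}\,\overline{a}$                 2       4    |  $\overline{p} b f \overline{w}\,\overline{a}$                1       2
    $p b \overline{f} w a$                             1       1    |  $\overline{p} b \overline{f} w a$                            1       1
    $p b \overline{f} w \overline{a}$                  1       1    |  $\overline{p} b \overline{f} w \overline{a}$                 1       1
    $p b \overline{f}\,\overline{w} a$                 1       2    |  $\overline{p} b \overline{f}\,\overline{w} a$                1       2
    $p b \overline{f}\,\overline{w}\,\overline{a}$     1       2    |  $\overline{p} b \overline{f}\,\overline{w}\,\overline{a}$    1       2
    $p \overline{b} f w a$                             2       4    |  $\overline{p}\,\overline{b} f w a$                           0       0
    $p \overline{b} f w \overline{a}$                  2       5    |  $\overline{p}\,\overline{b} f w \overline{a}$                1       1
    $p \overline{b} f \overline{w} a$                  2       4    |  $\overline{p}\,\overline{b} f \overline{w} a$                0       0
    $p \overline{b} f \overline{w}\,\overline{a}$      2       5    |  $\overline{p}\,\overline{b} f \overline{w}\,\overline{a}$    1       1
    $p \overline{b}\,\overline{f} w a$                 2       2    |  $\overline{p}\,\overline{b}\,\overline{f} w a$               0       0
    $p \overline{b}\,\overline{f} w \overline{a}$      2       2    |  $\overline{p}\,\overline{b}\,\overline{f} w \overline{a}$    0       0
    $p \overline{b}\,\overline{f}\,\overline{w} a$     2       2    |  $\overline{p}\,\overline{b}\,\overline{f}\,\overline{w} a$   0       0
    $p \overline{b}\,\overline{f}\,\overline{w}\,\overline{a}$  2  2  |  $\overline{p}\,\overline{b}\,\overline{f}\,\overline{w}\,\overline{a}$  0  0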
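Both rankings of this example can be recomputed in a few lines. The sketch
below derives the tolerance partition by the usual layer-by-layer procedure,
evaluates the system-Z rank as 0 for worlds that falsify nothing and 1 plus
the highest layer index of a falsified conditional otherwise, and evaluates
the c-representation as the sum of the falsification penalties. The penalties
1, 2, 2, 1, 1 are taken from the example rather than recomputed from (4.19);
the encoding of worlds and all identifiers are illustrative assumptions.

```python
from itertools import product

ATOMS = "pbfwa"
WORLDS = [dict(zip(ATOMS, bits)) for bits in product([True, False], repeat=5)]

# Each conditional (B|A) is a pair (antecedent A, consequent B) of predicates.
RULES = {
    "r1": (lambda w: w["b"], lambda w: w["f"]),       # birds fly
    "r2": (lambda w: w["p"], lambda w: w["b"]),       # penguins are birds
    "r3": (lambda w: w["p"], lambda w: not w["f"]),   # penguins do not fly
    "r4": (lambda w: w["b"], lambda w: w["w"]),       # birds have wings
    "r5": (lambda w: w["f"], lambda w: w["a"]),       # flying animals are airborne
}
# Falsification penalties kappa_i^- as stated in the example.
PENALTY = {"r1": 1, "r2": 2, "r3": 2, "r4": 1, "r5": 1}

def verifies(r, w):
    a, b = RULES[r]
    return a(w) and b(w)

def falsifies(r, w):
    a, b = RULES[r]
    return a(w) and not b(w)

def tolerance_partition():
    """Repeatedly split off the conditionals tolerated by all remaining ones."""
    remaining, layers = set(RULES), []
    while remaining:
        layer = {r for r in remaining
                 if any(verifies(r, w)
                        and not any(falsifies(s, w) for s in remaining)
                        for w in WORLDS)}
        if not layer:
            raise ValueError("the set of conditionals is inconsistent")
        layers.append(layer)
        remaining -= layer
    return layers

LAYERS = tolerance_partition()
LAYER_OF = {r: i for i, layer in enumerate(LAYERS) for r in layer}

def kappa_z(w):
    """System-Z rank: 0, or 1 + highest layer of a falsified conditional."""
    falsified = [r for r in RULES if falsifies(r, w)]
    return 1 + max(LAYER_OF[r] for r in falsified) if falsified else 0

def kappa_c(w):
    """c-representation: sum of the penalties of all falsified conditionals."""
    return sum(PENALTY[r] for r in RULES if falsifies(r, w))

def world(p, b, f, w, a):
    return dict(zip(ATOMS, (p, b, f, w, a)))
```

For instance, `kappa_z(world(True, False, True, True, False))` is 2 while
`kappa_c` of the same world is 5, since the world falsifies $r_2$, $r_3$ and
$r_5$ and the sum counts each of them.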



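In a probabilistic framework, we obtain the following theorem characterizing
c-revisions of probability distributions, immediately by making use of
Corollary 3.6.2, page 50 and observing the success condition (see also
[KI98a]).

Theorem 4.6.2. Suppose $P$ is a probability distribution and
$\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$ is a $P$-consistent
set of probabilistic conditionals.

A probability distribution $P^*$ is a c-revision of $P$ with respect to
$\mathcal{R}$ if and only if there are real numbers
$\alpha_0, \alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ with
$\alpha_0 > 0$ and $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$
satisfying the positivity condition

$$\alpha_i^+, \alpha_i^- \geq 0, \quad \alpha_i^+ = 0 \text{ iff } x_i = 0, \quad \alpha_i^- = 0 \text{ iff } x_i = 1, \quad 1 \leq i \leq n, \qquad (4.21)$$

and the adjustment condition

$$(1 - x_i)\,\alpha_i^+ \sum_{\omega \models A_iB_i} P(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^- = x_i\,\alpha_i^- \sum_{\omega \models A_i\overline{B_i}} P(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-, \qquad (4.22)$$

$1 \leq i \leq n$, such that

$$P^*(\omega) = \alpha_0\, P(\omega) \prod_{\substack{1 \leq i \leq n \\ \omega \models A_iB_i}} \alpha_i^+ \prod_{\substack{1 \leq i \leq n \\ \omega \models A_i\overline{B_i}}} \alpha_i^- \qquad (4.23)$$

for all worlds $\omega \in \Omega$.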
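To see the product form (4.23) at work, one can search numerically for
suitable factors. The sketch below uses a simple alternating rescaling: for
each conditional in turn, the worlds verifying it and the worlds falsifying it
are multiplied by the two factors that make its conditional probability equal
to $x_i$, and the factors are accumulated. This procedure is an assumption
made for illustration only. It is not the minimum cross-entropy revision, its
convergence is not guaranteed in general, and it presupposes $0 < x_i < 1$ and
a positive prior. The example conditionals are hypothetical.

```python
import math

def cond_prob(Q, a, b):
    """Q(B|A) for predicates a, b; assumes Q(A) > 0."""
    q_a = sum(p for w, p in Q.items() if a(w))
    return sum(p for w, p in Q.items() if a(w) and b(w)) / q_a

def c_revise(P, conds, sweeps=100):
    """Alternating rescaling; returns (alpha_plus, alpha_minus, P_star)."""
    plus, minus = [1.0] * len(conds), [1.0] * len(conds)
    Q = dict(P)
    for _ in range(sweeps):
        for i, (a, b, x) in enumerate(conds):
            q = cond_prob(Q, a, b)
            f_plus, f_minus = x / q, (1 - x) / (1 - q)
            plus[i] *= f_plus       # accumulated factor for verifying worlds
            minus[i] *= f_minus     # accumulated factor for falsifying worlds
            Q = {w: p * (f_plus if a(w) and b(w) else f_minus if a(w) else 1.0)
                 for w, p in Q.items()}
    return plus, minus, Q

def product_form(P, conds, plus, minus, alpha0=1.0):
    """Right-hand side of (4.23) for given factors."""
    out = {}
    for w, p in P.items():
        value = alpha0 * p
        for (a, b, _), ap, am in zip(conds, plus, minus):
            if a(w):
                value *= ap if b(w) else am
        out[w] = value
    return out

# Hypothetical example: uniform prior over worlds (a, b),
# revised by (b|a)[0.8] and (a|b)[0.6].
P = {(x, y): 0.25 for x in (True, False) for y in (True, False)}
is_a = lambda w: w[0]
is_b = lambda w: w[1]
conds = [(is_a, is_b, 0.8), (is_b, is_a, 0.6)]

plus, minus, P_star = c_revise(P, conds)
```

In this example each rescaling step preserves the probability of the
antecedent, so the total mass stays 1 and $\alpha_0 = 1$; the resulting
`P_star` coincides with `product_form(P, conds, plus, minus)`, and the
adjustment condition (4.22) amounts to
$(1 - x_i)\,P^*(A_iB_i) = x_i\,P^*(A_i\overline{B_i})$.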

Results for probabilistic and OCF-c-representations are easily obtained
by using the corresponding uniform function as prior knowledge $P_0$ or $\kappa_0$,
respectively. The normalizing effect of the uniform prior function may then be
subsumed by the constant $\alpha_0$ so that no explicit occurrence of prior knowledge
will be necessary in the formulas (4.23) and (4.16), respectively.

Theorems 4.6.1 and 4.6.2 above provide a practical method to realize revisions
following the principle of conditional preservation in an OCF- or probabilistic
framework, respectively. Note that they generalize Proposition 4.5.1
appropriately in so far as now sets of conditionals are considered. Intuitively,
the involved constants $\alpha_0, \alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ and
$\kappa_0, \kappa_1^+, \kappa_1^-, \ldots, \kappa_n^+, \kappa_n^-$, respectively, should be chosen
appropriately to keep the amount of change “minimal”. The revision operator $*_0$
presented in Section 4.4 (see (4.12), page 61) obviously intends to meet this
requirement.

In the next chapter, we will develop the ME-revision of a probability
distribution $P$ by a set of quantified conditionals $\mathcal{R}$ from the approach (4.23)
by imposing suitable constraints on the constants $\alpha_0, \alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$.




5. Characterizing the Principle of Minimum 
Cross-Entropy 



Probability theory provides a sound and convenient machinery to be used
for knowledge representation and automated reasoning (see, for instance,
[Cox46, DPT90, DP91b, LS88, Pea88, TGK92]). In many cases, only relatively
few relationships between relevant variables are known, due to incomplete
information. Or an abstract representation may be intended, incorporating
only fundamental relationships. In both cases, the knowledge explicitly
stated is not sufficient to determine a probability distribution uniquely.
One way to cope with this indeterminacy is to calculate upper and lower
bounds for probabilities (cf. [Nil86, TGK92, DPT90]). This method, however,
brings about two problems: the inferred bounds are sometimes too wide to be
informative, and one has to handle intervals instead of single values.

An alternative way that provides best expectation values for the unknown
probabilities and guarantees logically sound reasoning is to use the
principle of maximum entropy or, respectively, the principle of minimum
cross-entropy to represent all available probabilistic knowledge by a unique
distribution (see Section 2.5; cf. [Sho86, Kul68, Jay83a, GHK94]). Here we
assume the available knowledge to consist of a (consistent) set $\mathcal{R}$ of
conditionals, each equipped with a probability, usually providing only
incomplete probabilistic knowledge.

The aim of this chapter is to establish a direct and constructive link
between probabilistic conditionals and their suitable representation via
distributions, taking prior knowledge into account if necessary. We develop
the following four principles which mark the cornerstones for using
quantified conditionals consistently for probabilistic knowledge
representation and updating:

(P1) The principle of conditional preservation: this is to express that prior
conditional dependencies shall be preserved “as far as possible” under
adaptation;

(P2) the idea of a functional concept which underlies the adaptation and
which allows us to calculate a posterior distribution from prior and
new knowledge;

G. Kern-Isberner: Conditionals in NMR and Belief Revision, LNCS 2087, pp. 73-90, 2001. 

© Springer- Verlag Berlin Heidelberg 2001 
(P3) the principle of logical coherence¹: posterior distributions shall be used
coherently as priors for further inferences; and

(P4) the principle of representation invariance: the resulting distribution
shall not depend upon the actual probabilistic representation of the
new information.

(P1) links numerical changes to the conditional structure of the new
information. (P2) realizes a computable relationship between prior and posterior
knowledge by means of appropriate real functions. (P3) forestalls ambiguous
results of updating procedures, and (P4) should be self-evident within a
probabilistic framework.

As we will show, the only method that solves the probabilistic revision
problem

$(*_{prob})$ Given a (prior) distribution $P$ and some finite set of probabilistic
conditionals $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$, how
should $P$ be modified to yield a (posterior) distribution $P^*$ with
$P^* \models \mathcal{R}$?

while obeying all of the principles (P1) to (P4) is provided by the principle
of minimum cross-entropy. The first two axioms (P1) and (P2) will lead to a
scheme for adjusting a prior distribution to new conditional information, and
the principles of logical coherence and of representation invariance will be
applied to this scheme, yielding the desired result. Thus a new characterization
of the ME-principles is obtained, completely based on probabilistic
conditionals and establishing reasoning at optimum entropy as a most fundamental
inference method in the area of quantified uncertain reasoning.

Compared to the earlier papers [PV90, SJ80], the characterization pre- 
sented here points out a more constructive approach to the ME-principles. 
We will show that ME-inference not only respects (conditional) independen- 
cies but that it is basically determined by conditional dependencies (obey- 
ing independence properties where no dependency exists), recommending the 
ME-principles as most adequate methods for reasoning with probabilistic con- 
ditionals. So in contrast to Bayesian networks (cf. e.g. [LS88]), probabilistic 
networks based on ME-techniques (cf. [RKI97b, RM96]) do not require lots 
of probabilities and independence assumptions to process quantified condi- 
tional knowledge properly. Moreover, the methods used in this book are quite 
different from those in [SJ80] and in [PV90]. In particular, there will be no 
need to make use of optimization theory, as in [SJ80], or to transfer the 
problem into the context of linear algebra, as in [PV90]. Our development
explains clearly how the ME-principles may be completely based on probabilistic
conditionals. This may improve significantly the explanatory features of
computational systems that use these principles for knowledge representation
and processing (as, e.g., SPIRIT, cf. [RKI97b, RM96]).

¹ This principle is called the principle of logical consistency in [KI98a].

The results presented in this chapter were published in [KI96a] and in 
[KI98a]. 



5.1 Conditional Preservation 

The principle of conditional preservation (P1) has already been formalized
and explained in Section 4.5, using the notions of conditional structures of
worlds and that of indifference with respect to (sets of) conditionals. In
Section 4.6, Definition 4.6.1, revisions satisfying this principle of
conditional preservation have been introduced as c-revisions, realizing perfectly a
conditional-logical approach to the adaptation problem $(*_{prob})$:

Postulate (P1): conditional preservation

The solution $P^*$ of $(*_{prob})$ is a c-revision.

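Theorem 4.6.2 characterizes c-revisions providing a solution to problem
$(*_{prob})$ as distributions of the form

$$P^*(\omega) = \alpha_0\, P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} \alpha_i^- \tag{5.1}$$

where $\alpha_0, \alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ are real numbers with $\alpha_0 > 0$ and
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ satisfying the positivity condition, (4.21),

$$\alpha_i^+, \alpha_i^- \ge 0, \quad \alpha_i^+ = 0 \text{ iff } x_i = 0, \quad \alpha_i^- = 0 \text{ iff } x_i = 1 \tag{5.2}$$

and the adjustment condition, (4.22),

$$(1 - x_i)\,\alpha_i^+ \sum_{\omega \models A_iB_i} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} \alpha_j^- \;=\; x_i\,\alpha_i^- \sum_{\omega \models A_i\overline{B_i}} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} \alpha_j^- \tag{5.3}$$

for $1 \le i \le n$. Due to their simple structure, c-revisions were taken as an
intuitively appealing approach to realize adaptations to conditional constraints
in [KI97a].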
To maintain compatibility between prior and posterior distributions, $P^*$
has to be $P$-consistent (which coincides with the notion of $P$-continuity,
denoted as $P^* \ll P$; cf. Definition 4.5.1, p. 63), i.e. $P(\omega) = 0$ implies
$P^*(\omega) = 0$. Thus to avoid obvious inconsistencies, the set $\mathcal{R}$ is supposed
to be $P$-consistent:

Definition 5.1.1. A set $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ of probabilistic conditionals is said
to be $P$-consistent iff there is some distribution $Q$ with $Q \ll P$ and $Q \models \mathcal{R}$.

Definition 5.1.2. For a prior distribution $P$ and some $P$-consistent set $\mathcal{R}$
of probabilistic conditionals, let $C(P, \mathcal{R})$ denote the set of all c-revisions of $P$
by $\mathcal{R}$:

$$C(P, \mathcal{R}) := \{P^* \mid P^* \text{ is a c-revision of } P \text{ by } \mathcal{R}\}.$$

Remark 5.1.1. Throughout this chapter, we will assume without further mention
that the necessity of zero posterior probabilities is stated explicitly in
$\mathcal{R}$, i.e. if for any $Q \ll P$, $Q \models \mathcal{R}$ implies $Q(\omega) = 0$, then $P(\omega) = 0$, or there
is a conditional $(B|A)[x] \in \mathcal{R}$ such that either $x = 1$ and $\omega \models A\overline{B}$ or $x = 0$
and $\omega \models AB$.

The ME-solution to $(*_{prob})$ is the one distribution $P_{ME}$ that satisfies all
constraints in $\mathcal{R}$ and has minimal cross-entropy with respect to $P$, i.e. $P_{ME}$
solves the minimization problem

$$\min\ R(Q, P) = \sum_{\omega} Q(\omega) \log \frac{Q(\omega)}{P(\omega)} \tag{5.4}$$

s.t. $Q$ is a probability distribution with $Q \models \mathcal{R}$
(see Section 2.5, p. 25).
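
As a small illustration of the objective in (5.4), the following sketch evaluates the cross-entropy of two distributions given as dictionaries. The function name, the world labels and the numbers are assumptions made for this example only; the conventions used (terms with $Q(\omega) = 0$ contribute nothing, and the value is infinite if $Q$ puts mass where $P$ has none) are the usual ones.

```python
import math

def cross_entropy(Q, P):
    """Cross-entropy R(Q, P) = sum_w Q(w) * log(Q(w) / P(w)), cf. (5.4).

    Q and P map worlds to probabilities. Worlds with Q(w) == 0 contribute 0.
    If Q(w) > 0 while P(w) == 0, Q is not P-continuous and R is infinite.
    """
    total = 0.0
    for world, q in Q.items():
        if q == 0.0:
            continue
        p = P.get(world, 0.0)
        if p == 0.0:
            return math.inf
        total += q * math.log(q / p)
    return total

# Hypothetical prior P and candidate posterior Q over two atoms a, b ("~" marks negation).
P = {"ab": 0.4, "a~b": 0.1, "~ab": 0.3, "~a~b": 0.2}
Q = {"ab": 0.25, "a~b": 0.25, "~ab": 0.3, "~a~b": 0.2}

print(cross_entropy(P, P))  # no change, no cost
print(cross_entropy(Q, P))  # strictly positive
```

Here `Q` satisfies the single conditional $(b|a)[0.5]$, so it is one feasible point of the minimization problem, though not necessarily the minimizer.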

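If $\mathcal{R}$ is a $P$-consistent (finite) set of conditionals, then the ME-solution
$P_{ME}$ of $(*_{prob})$ is guaranteed to exist (cf. [Csi75]).

The condition $Q \models \mathcal{R}$ imposed on a distribution $Q$ can be transformed
equivalently into a system of linear equality constraints for the probabilities
$Q(\omega)$. Using the well-known Lagrange techniques (see, for instance, [Jay83b]),
we may represent $P_{ME}$ in the form

$$P_{ME}(\omega) = \alpha_0\, P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} \alpha_i^{1 - x_i} \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} \alpha_i^{-x_i} \tag{5.5}$$

with the $\alpha_i$'s being exponentials of the Lagrange multipliers, one for each
conditional in $\mathcal{R}$, and $\alpha_0 = \exp(\lambda_0 - 1)$, where $\lambda_0$ is the Lagrange multiplier
of the constraint $\sum_{\omega} Q(\omega) = 1$.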
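By construction, $P_{ME}$ satisfies all conditionals in $\mathcal{R}$: $P_{ME}(B_i|A_i) = x_i$,
which is equivalent to $(1 - x_i) P_{ME}(A_iB_i) = x_i P_{ME}(A_i\overline{B_i})$ for all $i$, $1 \le i \le n$.
So $\alpha_1, \ldots, \alpha_n$ are solutions of the nonlinear equations

$$\alpha_i = \frac{x_i}{1 - x_i} \cdot \frac{\displaystyle\sum_{\omega \models A_i\overline{B_i}} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} \alpha_j^{1 - x_j} \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} \alpha_j^{-x_j}}{\displaystyle\sum_{\omega \models A_iB_i} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} \alpha_j^{1 - x_j} \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} \alpha_j^{-x_j}}, \tag{5.6}$$

$$\alpha_i \begin{cases} > 0 & : x_i \in (0,1) \\ = \infty & : x_i = 1 \\ = 0 & : x_i = 0 \end{cases}, \quad 1 \le i \le n, \tag{5.7}$$

with $\infty^0 = 1$, $\infty^{-1} = 0$ and $0^0 = 1$. $\alpha_0$ arises simply as
a normalizing factor. Each $\alpha_i$ symbolizes the impact of the corresponding
rule when $P$ is modified. It depends upon the prior distribution $P$, the other
rules and probabilities in $\mathcal{R}$ and, in a distinguished way, on the probability
of the conditional it corresponds to. Using the representation formula (5.5)
above, it is possible to indicate which of the conditionals in $\mathcal{R}$ actually makes
a contribution to a conditional information derived from the posterior
ME-distribution (similar to listing active rules in rule based systems).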
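
For a single conditional ($n = 1$) the products over $j \ne i$ in (5.6) are empty, so the factor has a closed form and (5.5) can be evaluated directly. The sketch below does this under the assumptions $0 < x < 1$ and $P(AB) > 0$; the function name, the encoding of worlds as pairs of booleans and the numbers are hypothetical choices for illustration.

```python
def me_revision_single(P, in_A, in_B, x):
    """ME-revision of P by one conditional (B|A)[x], assuming 0 < x < 1 and P(AB) > 0.

    P maps worlds to prior probabilities; in_A and in_B are predicates on worlds.
    Returns the factor alpha from (5.6) and the posterior built as in (5.5).
    """
    p_ab = sum(p for w, p in P.items() if in_A(w) and in_B(w))
    p_anb = sum(p for w, p in P.items() if in_A(w) and not in_B(w))
    alpha = x / (1.0 - x) * p_anb / p_ab  # (5.6) with empty products

    def factor(w):
        if not in_A(w):
            return 1.0  # conditional not applicable in w
        return alpha ** (1.0 - x) if in_B(w) else alpha ** (-x)

    unnormalized = {w: p * factor(w) for w, p in P.items()}
    alpha0 = 1.0 / sum(unnormalized.values())  # normalizing factor
    return alpha, {w: alpha0 * u for w, u in unnormalized.items()}

# Hypothetical prior over worlds (a, b); revise by (b|a)[0.5].
prior = {(True, True): 0.4, (True, False): 0.1, (False, True): 0.3, (False, False): 0.2}
alpha, posterior = me_revision_single(prior, lambda w: w[0], lambda w: w[1], 0.5)
post_b_given_a = posterior[(True, True)] / (posterior[(True, True)] + posterior[(True, False)])
print(alpha, post_b_given_a)
```

Worlds falsifying the antecedent keep their prior mass up to the common factor $\alpha_0$, which is the sense in which only the conditional structure of a world matters.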



Comparing (5.5), (5.6) and (5.7) to (5.1), (5.3) and (5.2) above, we see
that the ME-distribution $P_{ME}$ is in particular a c-revision of $P$ by $\mathcal{R}$, with
$\alpha_i^+ = \alpha_i^{1 - x_i}$ and $\alpha_i^- = \alpha_i^{-x_i}$.

Because ME-revisions exist for priors $P$ and $P$-consistent sets $\mathcal{R}$, we have

Corollary 5.1.1. For any prior distribution $P$ and any $P$-consistent set $\mathcal{R}$
of probabilistic conditionals, $C(P, \mathcal{R}) \neq \emptyset$.

Note that we presupposed zero probabilities to be represented explicitly 
(cf. remark after Definition 5.1.2). 

So c-revisions generalize the concept of ME-revisions and embed it into a
conditional-logical environment. We will make use of probabilistic c-revisions
in the form (5.1). Distributions of this type will play a major part in the
following.

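Definition 5.1.3. Let $P$ be a distribution, and let $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ be
non-negative real numbers such that

$$\sum_{\omega} P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} \alpha_i^- > 0.$$

Then $P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-]$ denotes the distribution

$$P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-](\omega) := \alpha_0\, P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} \alpha_i^-$$

where

$$\alpha_0 = \Bigg( \sum_{\omega} P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} \alpha_i^- \Bigg)^{-1}.$$

The normalizing factor $\alpha_0$ is completely determined by $P$ and
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$. Note that $P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-]$ is $P$-consistent.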

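According to Theorem 4.6.2, for any c-revision $P^*$ of $P$ by $\mathcal{R}$, there are
non-negative real weight factors $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ satisfying (5.2) and (5.3)
such that $P^* = P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-]$. Define

$$wf(P^*) := \{(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in \mathbb{R}^{2n} \mid (\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \text{ satisfies (5.2), (5.3) and } P^* = P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-]\}$$

for any $P^* \in C(P, \mathcal{R})$. In general, weight factors of c-revisions are not
uniquely determined, so that $\mathrm{card}(wf(P^*)) \ge 1$.

As the proof of Theorem 4.6.2 shows, (5.2) ensures that all premises $A_i$
occurring in $\mathcal{R}$ have positive probabilities in $P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-]$, and (5.3)
then is equivalent to $P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-] \models \mathcal{R}$.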

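Corollary 5.1.2. Let $P$ be a distribution, and suppose $\mathcal{R}$ is a $P$-consistent
set of probabilistic rules.

If $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ are reals satisfying (5.2) then
$P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-] \models \mathcal{R}$ iff $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ fulfill (5.3).

Therefore, we define

$$WF(P, \mathcal{R}) := \bigcup_{P^* \in C(P, \mathcal{R})} wf(P^*) \tag{5.8}$$

$$= \{(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in \mathbb{R}^{2n} \mid (\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \text{ satisfies (5.2), (5.3)}\} \tag{5.9}$$
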




So, c-revisions actually realize quite a simple idea of adaptation to new
conditional information:

When calculating the posterior probability function $P^*$, one only has to
check the conditional structure of each elementary event $\omega$ with respect to
$\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$, set up $P^*(\omega)$ according
to (5.1) with unknown quantities $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ and then determine
appropriate values for these $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ using (5.2) and (5.3), so that
$\mathcal{R}$ is satisfied. Finally, $\alpha_0$ is computed as a normalizing factor to make $P^*$ a
probability distribution.
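
The scheme just described can be sketched in a few lines: multiply each prior probability by the weight factors selected by the world's conditional structure, normalize, and check the conditionals. The helper names `c_revise` and `satisfies`, the encoding of conditionals as predicate pairs, and the data are assumptions for this illustration; the weight factors are supplied by hand rather than solved for.

```python
from itertools import product

def c_revise(P, conditionals, weights):
    """Return P[a1+, a1-, ..., an+, an-] in the sense of (5.1).

    P            : dict mapping worlds to prior probabilities
    conditionals : list of (in_A, in_B, x); in_A, in_B are predicates on worlds
    weights      : list of (a_plus, a_minus), one pair per conditional
    """
    unnormalized = {}
    for world, prior in P.items():
        value = prior
        for (in_A, in_B, _x), (a_plus, a_minus) in zip(conditionals, weights):
            if in_A(world):  # conditional applicable: verified or falsified
                value *= a_plus if in_B(world) else a_minus
        unnormalized[world] = value
    total = sum(unnormalized.values())
    if total <= 0.0:
        raise ValueError("weight factors leave no probability mass")
    # dividing by total is multiplication by the normalizing factor alpha_0
    return {world: value / total for world, value in unnormalized.items()}

def satisfies(Q, conditionals, tol=1e-9):
    """Check Q(B_i|A_i) = x_i for all i, as (1 - x_i) Q(A_i B_i) = x_i Q(A_i not-B_i)."""
    for in_A, in_B, x in conditionals:
        q_ab = sum(q for w, q in Q.items() if in_A(w) and in_B(w))
        q_anb = sum(q for w, q in Q.items() if in_A(w) and not in_B(w))
        if q_ab + q_anb <= 0.0 or abs((1.0 - x) * q_ab - x * q_anb) > tol:
            return False
    return True

# Hypothetical data: uniform prior over worlds (a, b, c) and one conditional (b|a)[0.8].
worlds = list(product([True, False], repeat=3))
prior = {w: 1.0 / len(worlds) for w in worlds}
rules = [(lambda w: w[0], lambda w: w[1], 0.8)]

# P(ab) = P(a not-b) = 0.25, so the pair (0.8 * 0.25, 0.2 * 0.25) fulfils (5.3).
posterior = c_revise(prior, rules, [(0.2, 0.05)])
unchanged = c_revise(prior, rules, [(1.0, 1.0)])  # trivial weights reproduce the prior
print(satisfies(posterior, rules), satisfies(unchanged, rules))
```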

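The following example briefly illustrates this adaptation scheme.

Example 5.1.1. Let $P$ be a positive distribution over two atoms $a, b$, and
suppose $\mathcal{R} = \{(b|a)[x]\}$ with $x \in (0,1)$. Applying the formulas above, any
c-revision $P^*$ of $P$ by $\mathcal{R}$ may be written as

$$P^*(ab) = \alpha_0 P(ab)\,\alpha^+, \qquad P^*(a\overline{b}) = \alpha_0 P(a\overline{b})\,\alpha^-,$$
$$P^*(\overline{a}b) = \alpha_0 P(\overline{a}b), \qquad P^*(\overline{a}\overline{b}) = \alpha_0 P(\overline{a}\overline{b})$$

with

$$\alpha^+, \alpha^- > 0, \qquad (1 - x)\,\alpha^+ P(ab) = x\,\alpha^- P(a\overline{b})$$

and

$$\alpha_0 = \big( P(ab)\,\alpha^+ + P(a\overline{b})\,\alpha^- + P(\overline{a}b) + P(\overline{a}\overline{b}) \big)^{-1}.$$

So, for instance, we obtain an infinite set of c-revisions of $P$ by $\mathcal{R}$ by setting
simply

$$\alpha^+ = x\,P(a\overline{b})\,m, \qquad \alpha^- = (1 - x)\,P(ab)\,m \qquad \text{with } m \in \mathbb{N},$$

and choosing $\alpha_0$ appropriately.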
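
The non-uniqueness in Example 5.1.1 is easy to observe numerically. The sketch below builds the members $m = 1, 2, 3$ of the family for a hypothetical prior and $x = 0.5$; the prior values, labels and helper names are assumptions for illustration only.

```python
# Hypothetical positive prior over two atoms a, b ("~" marks negation) and x = 0.5.
P = {"ab": 0.4, "a~b": 0.1, "~ab": 0.3, "~a~b": 0.2}
x = 0.5

def revise(a_plus, a_minus):
    """c-revision of P by (b|a)[x] with the given weight factors."""
    unnormalized = {
        "ab": P["ab"] * a_plus,      # verifies (b|a)
        "a~b": P["a~b"] * a_minus,   # falsifies (b|a)
        "~ab": P["~ab"],             # conditional not applicable
        "~a~b": P["~a~b"],
    }
    alpha0 = 1.0 / sum(unnormalized.values())
    return {w: alpha0 * v for w, v in unnormalized.items()}

def family_member(m):
    """Weights a+ = x P(a~b) m and a- = (1 - x) P(ab) m from the example."""
    return revise(x * P["a~b"] * m, (1.0 - x) * P["ab"] * m)

def b_given_a(Q):
    return Q["ab"] / (Q["ab"] + Q["a~b"])

def prob_a(Q):
    return Q["ab"] + Q["a~b"]

members = [family_member(m) for m in (1, 2, 3)]
for Q in members:
    print(round(b_given_a(Q), 6), round(prob_a(Q), 6))
```

Every member meets the constraint $P^*(b|a) = x$, yet the posterior probability of the antecedent $a$ grows with $m$, so the members are different distributions.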



C-revisions provide a straightforward scheme to calculate solutions to 
the adjustment problem (*prob). ME-revisions are a special instance of this 
scheme, and it is of interest to investigate which of the characteristics of 
ME-distributions also hold for c-revisions in general. 

The author proved in [KI96b, KI98c] that c-revisions possess the pro- 
perties of system independence and of subset independence which both played 
an outstanding part in Shore and Johnson’s [SJ80] characterization of the 
ME-principle. They also cope in an elegant manner with irrelevant informa- 
tion in that posterior marginals are determined only by conditionals involving 
the respective variables (cf. [KI96b, KI98c]; see also [PV90]). All this is due 
to their modular, conditional-logical structure. 

There is another principle that ME-adaptations actually seem to fail at 
first sight and that can now be formulated adequately and proved in terms 
of c-revisions: it is the Atomicity Principle stating that substituting formulas 
for variables shall not affect the adjustment process (cf. [PV97]): 

Theorem 5.1.1 (Atomicity principle). Let $\mathcal{V} = \{V_1, V_2, \ldots\}$ and
$\mathcal{V}' = \{V_1', V_2', \ldots\}$ be two finite disjoint sets of binary propositional variables with
corresponding sets of elementary events $\Omega$ resp. $\Omega'$, and let $V$ be another
binary variable not contained in either of them. Suppose $A \in \mathcal{L}(\mathcal{V}')$
is a propositional formula that is neither a tautology nor a contradiction,
using only variables in $\mathcal{V}'$. Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$ be a
set of probabilistic conditionals with antecedents $A_i$ and consequences $B_i$ in
$\mathcal{L}(\mathcal{V} \cup \{V\})$. Let $A_i^A$ resp. $B_i^A$ denote the formulas that arise when each
occurrence of $V$ in $A_i$ resp. $B_i$ is replaced by $A$, $1 \le i \le n$, and so

$$\mathcal{R}^A = \{(B_1^A|A_1^A)[x_1], \ldots, (B_n^A|A_n^A)[x_n]\} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}(\mathcal{V} \cup \mathcal{V}').$$

Consider the two distributions $P'$ over $\mathcal{V} \cup \mathcal{V}'$ and $P$ over $\mathcal{V} \cup \{V\}$,
respectively, that are related via $P(v\omega) = \sum_{\omega' \models A} P'(\omega'\omega)$, and suppose $\mathcal{R}$ to be
$P$-consistent. Then $\mathcal{R}^A$ is $P'$-consistent, and $WF(P, \mathcal{R}) = WF(P', \mathcal{R}^A)$.

We omit the straightforward but technical proof. This result emphasizes
the importance of the weight factors $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ as logical
representatives of an adaptation scheme.

In the rest of this chapter, we will investigate which postulates are
to be imposed on a probabilistic c-revision so as to force the constants
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ to take on the ME-form $\alpha_i^+ = \alpha_i^{1 - x_i}$,
$\alpha_i^- = \alpha_i^{-x_i}$, where the $\alpha_i$'s are given by (5.6).



5.2 The Functional Concept 

The concept of c-revisions is not perfect: it fails to satisfy uniqueness.
Example 5.1.1 above shows that, even in the simple case when dealing with two
variables and one conditional to be adjusted to, the resulting c-revision is
not uniquely determined. In general, $WF(P, \mathcal{R})$ will contain lots of elements,
and there will be many different posterior c-revisions. Demanding uniqueness
means to assume a functional concept that guides the finding of a “best
solution” so that a unique distribution of type (5.1) arises depending on
the prior knowledge $P$ and the new conditional information $\mathcal{R}$.

It is not only the abstract property of uniqueness, however, that makes
a functional concept desirable. In a fundamental sense, there should be a
clear and understandable dependence between prior distribution, new
(conditional) information and resulting posterior distribution, i.e. a somehow
well-behaved function $F : (P, \mathcal{R}) \mapsto P^*$ that works for all distributions $P$
and all $P$-consistent sets $\mathcal{R}$ (cf. [Gar88]). These arguments $P$ and $\mathcal{R}$, however,
are quite monstrous. The knowledge represented by them is usually huge
and hard to grasp, let alone introducing such concepts as continuity or even
differentiability to describe a functional well-behavedness.

Moreover, $P^*$ should depend significantly only on the relevant parts of
the prior $P$, i.e. relevant with respect to the new information $\mathcal{R}$. Treating
this problem requires making clear what relevant information is, and how
irrelevant information should be handled.

Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and suppose $P_1, P_2$ are two
distributions with $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$ and for all $i = 1, \ldots, n$.
Then $P_1, P_2$ match on all parts which are relevant with respect to $\mathcal{R}$, so the
difference in their posterior relative changes should be insignificant, namely
a constant (due to possible differences in irrelevant parts):



Definition 5.2.1. Suppose $F : (P, \mathcal{R}) \mapsto P^*$ is a function that assigns a
$P$-consistent distribution $P^*$ satisfying $\mathcal{R}$ to any pair $(P, \mathcal{R})$ with $P$ being a
distribution and $\mathcal{R}$ representing a $P$-consistent set of probabilistic rules. Then
$F$ fulfills the relevance condition iff the following holds:

Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and suppose $P_1, P_2$ are two
distributions with $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$ and for all $i = 1, \ldots, n$,
$P_1(\omega) = 0$ iff $P_2(\omega) = 0$ and such that $\mathcal{R}$ is $P_1$- and $P_2$-consistent. Let
$P_k^* := F(P_k, \mathcal{R})$, $k = 1, 2$. Then $P_1^*(\omega) = 0$ iff $P_2^*(\omega) = 0$, and there is a
constant const such that

$$\frac{P_1^*(\omega)}{P_1(\omega)} \Big/ \frac{P_2^*(\omega)}{P_2(\omega)} = \text{const}$$

for all $\omega \in \Omega$ with $P_k^*(\omega) \neq 0$, $k = 1, 2$.



Let $\mathcal{AP}$ denote the set of all pairs $(P, \mathcal{R})$ representing a solvable
adjustment problem $(*_{prob})$:

$$\mathcal{AP} = \{(P, \mathcal{R}) \mid P \text{ distribution},\ \mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob},\ \mathcal{R}\ P\text{-consistent}\}$$

According to postulate (P1), the solution to the adaptation problem $(*_{prob})$
should be a c-revision. The following proposition shows that the prerequisites
formulated in Definition 5.2.1 in fact are able to capture the idea of relevant
information for c-revisions:



Proposition 5.2.1. Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and suppose
$P_1, P_2$ are two distributions with $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$ and
for all $i = 1, \ldots, n$, $P_1(\omega) = 0$ iff $P_2(\omega) = 0$ and such that $\mathcal{R}$ is $P_1$- and
$P_2$-consistent. Then $WF(P_1, \mathcal{R}) = WF(P_2, \mathcal{R})$.

Therefore distributions incorporating the same relevant conditional knowledge
have the same sets of weight factors occurring in the corresponding
c-adaptations.

Let us henceforth assume that there is a function

$$F_c : \mathcal{AP} \ni (P, \mathcal{R}) \mapsto P_c^* \in C(P, \mathcal{R}) \tag{5.10}$$

that assigns to each pair $(P, \mathcal{R}) \in \mathcal{AP}$ a particular c-revision. We will describe
$F_c$ by specific properties of the weight factors involved.
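Proposition 5.2.2. Assume $(P, \mathcal{R}) \in \mathcal{AP}$, $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$,
and let $P^* = P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-] \in C(P, \mathcal{R})$ be a c-revision of $P$ by $\mathcal{R}$
with weight factors $(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in wf(P^*)$.

Suppose $I \cup J$ is a partition of $\{1, \ldots, n\}$, and set $P_I := P[\alpha_i^+, \alpha_i^-]_{i \in I}$,
$\mathcal{R}_J := \{(B_j|A_j)[x_j]\}_{j \in J}$.

Then $(\alpha_j^+, \alpha_j^-)_{j \in J} \in WF(P_I, \mathcal{R}_J)$ and

$$P_I[\alpha_j^+, \alpha_j^-]_{j \in J} = P[\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-].$$

So, once weight factors $\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-$ are chosen to yield a “best”
solution $P^* \in C(P, \mathcal{R})$, they should yield “best” solutions in $C(P_I, \mathcal{R}_J)$ (with
all notations as stated in the text of Proposition 5.2.2). We name this property
continuity (of solutions):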

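Definition 5.2.2. Let $F_c$ be as described in (5.10). $F_c$ satisfies the continuity
condition if the following holds:

Suppose $(P, \mathcal{R}) \in \mathcal{AP}$ with $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and
assume $(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in wf(F_c(P, \mathcal{R}))$. Let $I \cup J$ be a partition of
$\{1, \ldots, n\}$, and set $P_I := P[\alpha_i^+, \alpha_i^-]_{i \in I}$, $\mathcal{R}_J := \{(B_j|A_j)[x_j]\}_{j \in J}$. Then
$(\alpha_j^+, \alpha_j^-)_{j \in J} \in wf(F_c(P_I, \mathcal{R}_J))$, i.e. $F_c(P_I, \mathcal{R}_J) = P_I[\alpha_j^+, \alpha_j^-]_{j \in J} = F_c(P, \mathcal{R})$.

Finally, $F_c$ should obey the principle of atomicity (cf. Theorem 5.1.1):

Definition 5.2.3. Let $F_c$ be as described in (5.10). $F_c$ satisfies the
atomicity condition if for any $(P, \mathcal{R}), (P', \mathcal{R}^A) \in \mathcal{AP}$ as in Theorem 5.1.1,
$wf(F_c(P, \mathcal{R})) = wf(F_c(P', \mathcal{R}^A))$.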

The following proposition derives necessary conditions for a function Fc 
to fulfill the conditions of relevance, continuity and atomicity in special but 
important cases: 

Proposition 5.2.3. Assume $F_c$ as described in (5.10).

(i) Suppose $P_1, P_2$ are positive distributions over two atoms $a, b$, with
$P_1(b|a) = P_2(b|a)$, and let $\mathcal{R} = \{(b|a)[x]\}$, $x \in (0,1)$. If $F_c$ satisfies the relevance
condition, then the weight factors $\alpha^+, \alpha^-$ resp. $\beta^+, \beta^-$ of $F_c(P_1, \mathcal{R})$ resp.
$F_c(P_2, \mathcal{R})$ are equal in pairs, i.e. $\alpha^+ = \beta^+$ and $\alpha^- = \beta^-$.

(ii) Suppose $F_c$ satisfies the conditions of relevance, continuity and
atomicity. Let $(P, \mathcal{R}) \in \mathcal{AP}$ with positive prior $P$ such that no variable
occurs both in antecedent and consequent of a conditional in $\mathcal{R}$ and all
assigned probabilities in $\mathcal{R}$ are different from 0 and 1. Then the weight
factors $\alpha^+, \alpha^-$ associated in $F_c(P, \mathcal{R})$ with a conditional in $\mathcal{R}$ only
depend upon the probability $x$ of this conditional and upon their quotient
$\alpha^+/\alpha^-$. That means, for any $(P, \mathcal{R}), (P', \mathcal{R}') \in \mathcal{AP}$, $P, P'$ positive,
$\mathcal{R} = \{(B|A)[x], (B_1|A_1)[x_1], \ldots\}$, $\mathcal{R}' = \{(B'|A')[x], (B_1'|A_1')[x_1'], \ldots\}$,
both sets finite, all of $x, x_i, x_i' \in (0,1)$, no variable occurring both in
antecedent and consequent of any conditional in $\mathcal{R}$ and $\mathcal{R}'$, and for
any weight factors $\alpha^+, \alpha^-$ resp. $\alpha'^+, \alpha'^-$ associated in $F_c(P, \mathcal{R})$ resp.
$F_c(P', \mathcal{R}')$ with the conditional $(B|A)[x]$ resp. $(B'|A')[x]$,

$$\frac{\alpha^+}{\alpha^-} = \frac{\alpha'^+}{\alpha'^-} \quad \text{implies} \quad \alpha^+ = \alpha'^+ \text{ and } \alpha^- = \alpha'^-.$$



Example 5.1.1 above shows that, in the cases dealt with by Proposition
5.2.3(i), all pairs $\alpha^+, \alpha^-$ of weight factors have to fulfill

$$\frac{\alpha^+}{\alpha^-} = \frac{x}{1 - x} \cdot \frac{P(a\overline{b})}{P(ab)}.$$

The cross ratio on the right hand side, depending only on prior and new
conditional probabilities, represents exactly relevant knowledge. The left hand
side is just the quotient of $\alpha^+$ and $\alpha^-$. This gives an intuitive reason for this
quotient to play a key role, as it is stated in Proposition 5.2.3(ii).
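
The step from Example 5.1.1 to this cross ratio is elementary; it is written out here as a sketch, using only the adjustment condition of the example and the positivity of $P$. The rewriting in terms of odds in the last line is an added observation, not a formula taken from the text.

```latex
\begin{align*}
(1-x)\,\alpha^{+}\,P(ab) &= x\,\alpha^{-}\,P(a\overline{b})
  && \text{(adjustment condition of Example 5.1.1)} \\
\frac{\alpha^{+}}{\alpha^{-}} &= \frac{x}{1-x}\cdot\frac{P(a\overline{b})}{P(ab)}
  && \text{(divide by } (1-x)\,\alpha^{-}\,P(ab) > 0\text{)} \\
 &= \frac{x}{1-x}\Big/\frac{P(b|a)}{1-P(b|a)}
  && \text{(since } P(a\overline{b})/P(ab) = (1-P(b|a))/P(b|a)\text{)}
\end{align*}
```

Read this way, the quotient is the posterior odds of $b$ given $a$ divided by the prior odds, so it depends on $P$ only through $P(b|a)$, which is in line with Proposition 5.2.3(i).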



Thus in the context of c-revisions, we identified clearly the parameters
weight factors should depend on to give rise to a reasonable functional
concept: the quotient $\alpha^+/\alpha^-$ and (the probability) $x$ incorporate all relevant
knowledge for the weight factors. Thus a reasonable functional concept for
c-revisions may be realized by setting

$$\alpha^+ = F^+(x, \alpha) \quad \text{and} \quad \alpha^- = F^-(x, \alpha), \tag{5.11}$$

with $\alpha := \alpha^+/\alpha^-$ and two real positive functions $F^+$ and $F^-$, defined on
$(0,1) \times \mathbb{R}^+$ and related by $F^+(x, \alpha)/F^-(x, \alpha) = \alpha$, i.e.

$$F^+(x, \alpha) = \alpha\, F^-(x, \alpha). \tag{5.12}$$



As our global function $F : \mathcal{AP} \ni (P, \mathcal{R}) \mapsto P^*$ is to work for arbitrary
$P$ and $\mathcal{R}$, the functions $F^+$ and $F^-$ are assumed to be independent of the
prior and new information actually present, thus representing a fundamental
inference pattern. Moreover, to yield “smooth” inferences we assume them to
be continuous on $(0,1) \times \mathbb{R}^+$. The functional concept, designed so far, should
also be applied to the extreme probabilities $x \in \{0, 1\}$, incorporating classical
logic as a limit case by assuming

$$F^+(0, 0) := \lim_{x \to 0,\ \alpha \to 0} F^+(x, \alpha) = 0, \qquad F^+(1, \infty) := \lim_{x \to 1,\ \alpha \to \infty} F^+(x, \alpha) \in \mathbb{R}^+, \tag{5.13}$$

and

$$F^-(0, 0) := \lim_{x \to 0,\ \alpha \to 0} F^-(x, \alpha) \in \mathbb{R}^+, \qquad F^-(1, \infty) := \lim_{x \to 1,\ \alpha \to \infty} F^-(x, \alpha) = 0, \tag{5.14}$$

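in accordance with (5.2). The resulting posterior distribution $P_F^*$ is a
c-revision of the form

$$P_F^*(\omega) = \alpha_0\, P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} F^+(x_i, \alpha_i) \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} F^-(x_i, \alpha_i) \tag{5.15}$$

with non-negative extended real numbers $\alpha_1, \ldots, \alpha_n \in \mathbb{R}^+ \cup \{0, \infty\}$ solving
the $n$ equations

$$\alpha_i = \begin{cases} \dfrac{x_i}{1 - x_i} \cdot \dfrac{\displaystyle\sum_{\omega \models A_i\overline{B_i}} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} F^+(x_j, \alpha_j) \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} F^-(x_j, \alpha_j)}{\displaystyle\sum_{\omega \models A_iB_i} P(\omega) \prod_{\substack{j \ne i \\ \omega \models A_jB_j}} F^+(x_j, \alpha_j) \prod_{\substack{j \ne i \\ \omega \models A_j\overline{B_j}}} F^-(x_j, \alpha_j)}, & x_i \neq 0, 1 \\[2ex] 0, & x_i = 0 \\ \infty, & x_i = 1 \end{cases} \tag{5.16}$$

(see (5.1)). Note that $\alpha_i \in \mathbb{R}^+$ for $x_i \in (0,1)$, because of the positivity
of both functions $F^+$ and $F^-$ and due to the $P$-consistency of $\mathcal{R}$. So the
positivity condition (5.2) is satisfied, and (5.16) corresponds to the
adjustment condition (5.3) here. Thus, for any $n$ non-negative extended real
numbers $\alpha_1, \ldots, \alpha_n \in \mathbb{R}^+ \cup \{0, \infty\}$, $(\alpha_1, \ldots, \alpha_n)$ is a solution of (5.16) iff
$F^+(x_1, \alpha_1), F^-(x_1, \alpha_1), \ldots, F^+(x_n, \alpha_n), F^-(x_n, \alpha_n)$ is a solution of (5.3)
satisfying (5.2).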



We summarize these remarks for the axiomatization of the second postulate
(P2):

Postulate (P2): functional concept for c-revisions

There is a function $F^* : \mathcal{AP} \ni (P, \mathcal{R}) \mapsto P_F^* \in C(P, \mathcal{R})$ that assigns to
each adjustment problem $(P, \mathcal{R}) \in \mathcal{AP}$ a particular c-revision $P_F^*$ satisfying
$\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$. $F^*$ is given by two real positive and
continuous functions $F^+$ and $F^-$ defined on $(0,1) \times \mathbb{R}^+$, fulfilling the
conditions (5.13) and (5.14) and related by (5.12), in that $P_F^* = F^*(P, \mathcal{R}) =: P *_F \mathcal{R}$
has the form (5.15) with $\alpha_1, \ldots, \alpha_n \in \mathbb{R}^+ \cup \{0, \infty\}$ solving (5.16).

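Define for (fixed) $F^*, F^+, F^-$ as in (P2) and for $(P, \mathcal{R}) \in \mathcal{AP}$,
$\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$,

$$WQ_F(P, \mathcal{R}) := \{(\alpha_1, \ldots, \alpha_n) \in (\mathbb{R}^+ \cup \{0, \infty\})^n \mid (\alpha_1, \ldots, \alpha_n) \text{ solves (5.16)}\} \tag{5.17}$$

to be the set of all weight quotients that belong to c-revisions

$$P[\alpha_1, \ldots, \alpha_n]_F(\omega) := \alpha_0\, P(\omega) \prod_{\substack{1 \le i \le n \\ \omega \models A_iB_i}} F^+(x_i, \alpha_i) \prod_{\substack{1 \le i \le n \\ \omega \models A_i\overline{B_i}}} F^-(x_i, \alpha_i).$$

So $(\alpha_1, \ldots, \alpha_n) \in WQ_F(P, \mathcal{R})$ iff
$(F^+(x_1, \alpha_1), F^-(x_1, \alpha_1), \ldots, F^+(x_n, \alpha_n), F^-(x_n, \alpha_n)) \in WF(P, \mathcal{R})$.

For any $(P, \mathcal{R}) \in \mathcal{AP}$, $F^*(P, \mathcal{R}) = P[\alpha_1, \ldots, \alpha_n]_F$ is described by a
particular element $(\alpha_1, \ldots, \alpha_n) \in WQ_F(P, \mathcal{R})$. This disagreeable dependence on
a special yet unknown solution of (5.16) may be overcome by assuming that
the functions $F^+$ and $F^-$ fulfill the condition of uniqueness: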

Definition 5.2.4. Let $F^+$ and $F^-$ be functions as described in (P2). $F^+$
and $F^-$ satisfy the uniqueness condition iff whenever $(P, \mathcal{R}) \in \mathcal{AP}$ and
$(\alpha_1, \ldots, \alpha_n), (\beta_1, \ldots, \beta_n) \in WQ_F(P, \mathcal{R})$ it holds that

$$P[\alpha_1, \ldots, \alpha_n]_F = P[\beta_1, \ldots, \beta_n]_F.$$

So, if $F^+$ and $F^-$ satisfy the uniqueness condition then $F^*$ is determined
by (5.16), i.e. $F^*(P, \mathcal{R}) = P[\alpha_1, \ldots, \alpha_n]_F$ for any $(\alpha_1, \ldots, \alpha_n) \in WQ_F(P, \mathcal{R})$.

The functional concept (P2), describing weight factors of c-revisions, was
initiated by the conditions of relevance, continuity and atomicity. The
uniqueness condition ensures recovering these properties from (P2):

Proposition 5.2.4. Let $F^* : \mathcal{AP} \ni (P, \mathcal{R}) \mapsto P_F^* \in C(P, \mathcal{R})$ be as in (P2)
with associated functions $F^+$ and $F^-$ satisfying the uniqueness condition.
Then $F^*$ fulfills the conditions of relevance, continuity and atomicity.

We will see in Section 5.5 that the condition of uniqueness will in fact
be satisfied for the special functions $F^+$ and $F^-$ that will be determined by
(P2) together with (P3) and (P4) (cf. Proposition 5.5.1).

Note that the conciseness of (P2) is essentially due to making use of
c-revisions. So the efforts we invested in developing this conditional-logical
concept begin to pay off, providing now an elegant functional concept.

After having put the functional dependencies in concrete terms we are
now going to study which properties the functions $F^+$ and $F^-$ should have
to guarantee reasonable probabilistic inferences. To simplify notation, we will
usually prefer the operational notation $P *_F \mathcal{R}$ to the functional $F^*(P, \mathcal{R})$,
where $*_F$ is described by (P2).



5.3 Logical Coherence 

Surely, the adaptation scheme (5.15) will be considered sound only if the
resulting posterior distribution can be used coherently as a prior distribution
for further adaptations. In particular, if we first adjust $P$ only to a subset
$\mathcal{R}_1 \subseteq \mathcal{R}$, and then use this posterior distribution to perform another
adaptation to the full conditional information $\mathcal{R}$, we should obtain the same
distribution as if we adjusted $P$ to $\mathcal{R}$ in only one step.

We state this demand for logical coherence as Postulate (P3):

Postulate (P3): logical coherence

For any distribution $P$ and any $P$-consistent sets $\mathcal{R}_1, \mathcal{R}_2 \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$, the
(final) posterior distribution which arises from a two-step process of adjusting
$P$ first to $\mathcal{R}_1$ and then adjusting this intermediate posterior to $\mathcal{R}_1 \cup \mathcal{R}_2$ is
identical to the distribution resulting from directly revising $P$ by $\mathcal{R}_1 \cup \mathcal{R}_2$.

More formally, the operator $*_F$ satisfies (P3) iff the following equation holds:

$$P *_F (\mathcal{R}_1 \cup \mathcal{R}_2) = (P *_F \mathcal{R}_1) *_F (\mathcal{R}_1 \cup \mathcal{R}_2). \tag{5.18}$$
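
Equation (5.18) can be checked numerically for ME-revision on a small case. The sketch below is an illustration only and not the method of this chapter: it approximates the ME-revision by repeatedly applying the single-conditional closed form (one conditional at a time, in cycles), and it assumes that this iteration has converged after the chosen number of sweeps. The prior, the two conditionals $(c|a)[0.8]$ and $(c|b)[0.6]$, and all function names are hypothetical.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=3))  # worlds as triples (a, b, c)

def mass(Q, pred):
    """Probability of the set of worlds satisfying pred."""
    return sum(q for w, q in Q.items() if pred(w))

def project(Q, cond):
    """ME-revision of Q by a single conditional (B|A)[x] with 0 < x < 1."""
    in_A, in_B, x = cond
    q_ab = mass(Q, lambda w: in_A(w) and in_B(w))
    q_anb = mass(Q, lambda w: in_A(w) and not in_B(w))
    alpha = x / (1.0 - x) * q_anb / q_ab
    out = {}
    for w, q in Q.items():
        if in_A(w):
            q *= alpha ** (1.0 - x) if in_B(w) else alpha ** (-x)
        out[w] = q
    total = sum(out.values())
    return {w: v / total for w, v in out.items()}

def me_revise(P, conds, sweeps=500):
    """Approximate ME-revision by cycling through single-conditional revisions."""
    Q = dict(P)
    for _ in range(sweeps):
        for cond in conds:
            Q = project(Q, cond)
    return Q

# Hypothetical positive prior: the k-th world gets weight k / 36.
P = {w: (k + 1) / 36.0 for k, w in enumerate(WORLDS)}
R1 = [(lambda w: w[0], lambda w: w[2], 0.8)]  # (c|a)[0.8]
R2 = [(lambda w: w[1], lambda w: w[2], 0.6)]  # (c|b)[0.6]

direct = me_revise(P, R1 + R2)                    # P * (R1 u R2)
two_step = me_revise(me_revise(P, R1), R1 + R2)   # (P * R1) * (R1 u R2)
max_gap = max(abs(direct[w] - two_step[w]) for w in WORLDS)
print(max_gap)
```

Up to numerical error the two routes give the same posterior, which is the behaviour (5.18) demands of the operator.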

Applying the principle of logical coherence will give us a further important
result in determining the functions $F^+$ and $F^-$:

Theorem 5.3.1. If the revision operator $*_F$ satisfies the postulate (P3) of
logical coherence then

$$F^-(0, 0) = F^+(1, \infty) = 1, \tag{5.19}$$

and $F^-$ necessarily fulfills the functional equation

$$F^-(x, \alpha\beta) = F^-(x, \alpha)\, F^-(x, \beta) \tag{5.20}$$

for all $x \in (0,1)$, $\alpha, \beta \in \mathbb{R}^+$.

Because of (5.12), $F^+$ satisfies $F^+(x, \alpha\beta) = F^+(x, \alpha)\, F^+(x, \beta)$ iff $F^-$
satisfies (5.20).

Theorem 5.3.1 is proved by checking conditions (5.19) and (5.20) in the
very special case that $P$ is a positive distribution over three variables $a, b, c$,
and $\mathcal{R}_1, \mathcal{R}_2$ are given by $\mathcal{R}_1 = \{(c|a)[x]\}$ and $\mathcal{R}_2 = \{(c|b)[y]\}$. These
conditions are necessary to guarantee a logically consistent behavior of the revision
process for this example, and because we assumed the functions $F^+, F^-$ to be
independent of the actual case we thus proved the general validity of (5.19)
and (5.20). In fact, there is little arbitrariness in choosing this special
example to which such a crucial meaning is assigned. The way in which two
conditionals with common conclusion should interact is one of the main
issues in conditional logic and refers to the antecedent conjunction problem (cf.
[Nut80, Spi91]). The validity of (5.19) and (5.20) ensures a sound probabilistic
treatment of this problem.

The functional equation (5.20) essentially restricts the type of the function
$F^-$ (and that of $F^+$, too):

Proposition 5.3.1. Let $*_F$ be a revision operator following (P2) such that
$F^+$ and $F^-$ satisfy (5.19) and (5.20). Then there is a (continuous) real
function $c(x)$, $x \in (0,1)$, with

$$\lim_{x \to 0} c(x) = 0 \quad \text{and} \quad \lim_{x \to 1} c(x) = -1 \tag{5.21}$$

such that

$$F^-(x, \alpha) = \alpha^{c(x)} \quad \text{and} \quad F^+(x, \alpha) = \alpha^{c(x) + 1} \tag{5.22}$$

for any positive real $\alpha$ and for any $x \in (0,1)$. Especially for $\alpha = 1$, this
implies

$$F^-(x, 1) = F^+(x, 1) = 1. \tag{5.23}$$

In Theorem 5.3.1 we showed how the coherence property (5.18) determines 
the part the quotients have to play in the revision process. We are now 
left with the investigation of the isolated impact of the numbers Xi which 
represent posterior conditional probabilities. 



5.4 Representation Invariance 

So far, we have largely neglected how (conditional) knowledge is represented
in $\mathcal{R}$. Indeed, the principle of atomicity deals with logical equivalence of
propositional formulas, but what about probabilistic equivalences, i.e.
equivalences that are due to elementary probability calculus (see Section 1.3.2)?
For instance, the sets of rules $\{(B|A)[x], A[y]\}$ and $\{AB[xy], A[y]\}$ are
equivalent in this respect because each rule in one set is derivable from the rules
in the other set. We surely expect the result of the revision process to be
independent of the syntactic representation of probabilistic knowledge in $\mathcal{R}$:

Postulate (P4): representation invariance 

If two $P$-consistent sets of probabilistic conditionals $\mathcal{R}$ and $\mathcal{R}'$ are
probabilistically equivalent then the posterior distributions $P * \mathcal{R}$ and $P * \mathcal{R}'$ resulting
from adapting the prior $P$ to $\mathcal{R}$ and to $\mathcal{R}'$, respectively, are identical.

The notion of probabilistic equivalence used here completely corresponds
to that introduced in [PV90]. Using the operational notation, we are able to
express (P4) more formally:
The revision operator $*_F$ satisfies (P4) iff

$P *_F \mathcal{R} = P *_F \mathcal{R}'$    (5.24)

for any two $P$-consistent and probabilistically equivalent sets
$\mathcal{R}, \mathcal{R}' \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$.

The demand for independence of syntactic representation of probabilistic
knowledge (P4) gives rise to two functional equations for $c(x)$ (cf. Proposition
5.3.1):

Proposition 5.4.1. Let the functions $F^-$ and $F^+$ describing the revision
operator $*_F$ in (P2) satisfy (5.22) with a continuous function $c(x)$ fulfilling
(5.21). If $*_F$ satisfies postulate (P4) resp. (5.24) then for all real
$x, x_1, x_2 \in (0,1)$ the following equations hold:

$c(x) + c(1 - x) = -1$    (5.25)

$c(x x_1 + (1 - x) x_2) = -c(x)\, c(x_1) - c(1 - x)\, c(x_2)$    (5.26)

The most obvious probabilistic equivalence is that of each two rules
$(B|A)[x]$ and $(\overline{B}|A)[1 - x]$. This implies (5.25). Equation (5.26) is again
proved by investigating a special but crucial revision problem. The relation

$P(b|a) = P(b|ac)\, P(c|a) + P(b|a\overline{c})\, P(\overline{c}|a)$

for arbitrarily chosen propositional variables $a, b, c$ is fundamental to
probabilistic conditionals, yielding the probabilistic equivalence of the two sets

$\mathcal{R} = \{(c|a)[x],\ (b|ac)[x_1],\ (b|a\overline{c})[x_2]\}$

and $\mathcal{R}' = \{(b|a)[y],\ (b|ac)[x_1],\ (b|a\overline{c})[x_2]\}$

with $y = x x_1 + (1 - x) x_2$ for real $x, x_1, x_2 \in (0,1)$. The validity of (5.24) in
this case necessarily implies (5.26) (see Appendix).

As a consequence of (5.25) and (5.26), we finally obtain: 

Theorem 5.4.1. If the operator $*_F$ in (P2) is to meet the demands for
logical coherence (P3) and for representation invariance (P4), then $F^+$ and $F^-$
necessarily have the forms

$F^+(x, \alpha) = \alpha^{1-x}$ and $F^-(x, \alpha) = \alpha^{-x}$,    (5.27)

respectively.
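
The forms in (5.27) correspond to the choice $c(x) = -x$ in (5.22). As a plausibility check only (it does not replace the proof in the Appendix), the following minimal sketch evaluates (5.20), (5.23), (5.25) and (5.26) numerically for that choice. The function names `c`, `F_minus`, `F_plus` and `check` are illustrative and not taken from the text.

```python
import math

def c(x):
    # Candidate exponent function read off from (5.27): c(x) = -x.
    return -x

def F_minus(x, alpha):
    # F^-(x, alpha) = alpha^{c(x)}, cf. (5.22)
    return alpha ** c(x)

def F_plus(x, alpha):
    # F^+(x, alpha) = alpha^{c(x) + 1}, cf. (5.22)
    return alpha ** (c(x) + 1)

def check(x, x1, x2, alpha, beta):
    """Return True if (5.20), (5.23), (5.25), (5.26) hold numerically at the given point."""
    eq_520 = math.isclose(F_minus(x, alpha * beta),
                          F_minus(x, alpha) * F_minus(x, beta))
    eq_523 = math.isclose(F_minus(x, 1.0), 1.0) and math.isclose(F_plus(x, 1.0), 1.0)
    eq_525 = math.isclose(c(x) + c(1 - x), -1.0)
    eq_526 = math.isclose(c(x * x1 + (1 - x) * x2),
                          -c(x) * c(x1) - c(1 - x) * c(x2))
    return eq_520 and eq_523 and eq_525 and eq_526

result = check(0.3, 0.7, 0.2, 2.5, 0.4)
```

A single sample point of course proves nothing; the sketch only illustrates how the functional equations fit together for $c(x) = -x$.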

The continuity of the functions $F^+$ and $F^-$ is essential to establish this
theorem (see Appendix). This means in particular the continuous integration
of the extreme probabilities 0 and 1, that is, the seamless encompassing of
classical logic.

5.5 Uniqueness and the Main Theorem

So far we have proved how the demands for logical coherence and for
representation invariance constrain the functions which we assumed to underlie the
adjustment of $P$ to $\mathcal{R}$, as described by the functional concept (P2). Applying
(5.27) to (5.15) and (5.16) we recognize that the posterior distribution
necessarily is of the same type (5.5) as the ME-distribution if it is to yield sound
and coherent inferences.

Therefore, we have nearly reached our goal. But one step is still missing:
Is this enough to characterize ME-inference within a conditional-logical
framework? Are there possibly several different solutions of type (5.5), only one
of which is the ME-distribution? And moreover, if we assume the functions
$F^+$ and $F^-$ to fulfill (5.27), is this sufficient to guarantee that the resulting
operator $*_F$ satisfies logical coherence and representation invariance?

The question of uniqueness of the posterior distribution is at the center of
all these problems. If it can be answered positively, we will have finished: The
unique posterior distribution of type (5.5) must be the ME-distribution, $*_F$
then corresponds to ME-inference, and ME-inference is known to fulfill (5.18)
and (5.24) as well as many other reasonable properties, cf. [PV90, SJ80, SJ81].
Moreover, together with (P2) uniqueness implies the conditions of relevance,
atomicity and continuity (cf. Proposition 5.2.4).

The uniqueness of the solution $(\alpha_1, \ldots, \alpha_n)$ of the fixpoint equation (5.6)
is not clear at all. Imagine the case that the set $\mathcal{R}$ representing new
conditional knowledge contains the same rule twice in different notations. All that
can be expected at best is a uniqueness statement for the product $\alpha_i \alpha_j$ of
the corresponding factors. Even if we exclude such pathological cases, (5.6)
is not easy to deal with.

But remember that we are primarily interested in the uniqueness of the 
posterior distribution, not in that of the solutions to (5.6). And indeed, this 
uniqueness is affirmed by the next theorem. In its proof, we will make use of 
cross-entropy as an excellently fitting measure of distance for distributions of 
type (5.5). 

Proposition 5.5.1. There is at most one solution of the adaptation problem 
(*prob) (see page 74) of type (5.5), i.e. the functional concept defined by (5.27) 
satisfies the uniqueness condition. 

The following theorem summarizes our results in characterizing ME- 
adjustment within a conditional-logical framework: 

Theorem 5.5.1 (Main Theorem). Let $*_{ME}$ denote the ME-revision
operator, that is, $*_{ME}$ assigns to a prior distribution $P$ and some $P$-consistent
set $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$ of probabilistic conditionals the one
distribution $P_{ME} = P *_{ME} \mathcal{R}$ which has minimal cross-entropy with respect
to $P$ among all distributions that satisfy $\mathcal{R}$.

Then $*_{ME}$ yields the only adaptation of $P$ to $\mathcal{R}$ that obeys the principle of
conditional preservation (P1), realizes a functional concept (P2) and satisfies
the postulates for logical coherence (P3) and for representation invariance
(P4). $*_{ME}$ is completely described by (5.5) and (5.6).



In this way, starting from a conditional-logical point of view we found a
new characterization of the ME-solution to the problem

Given a distribution $P$ and a $P$-consistent set of probabilistic
conditionals $\mathcal{R}$, which way is the best for revising $P$ by $\mathcal{R}$?

The thorough embedding of the problem within the conditional-logical
framework presented here conveys a clear understanding of what actually makes
the ME-distribution the best choice: ME-inference and probabilistic
conditionals fit perfectly well.

In concluding this chapter, let us summarize the steps we have taken to
obtain our conditional-logical characterization of ME-inference: The principle
of conditional preservation was used to determine the structure of the
posterior distribution. Then we assumed that a (continuous) functional concept
extending classical logic should underlie the adjustment process, and we
isolated the crucial parameters on which this concept should depend. It was
represented by means of two functions $F^+$ and $F^-$ which discriminate,
among the elementary events satisfying the antecedent of a rule, between
those that also satisfy its conclusion and those that do not. So they
constitute the decisive components for the extent of distortion the prior distribution
is to be exposed to under adjustment.

Only two further preconditions were necessary to arrive at the desired 
characterization: logical coherence and independence of syntactical represen- 
tation. While the latter postulate is usually considered to be fundamental to 
any reasonable inference procedure, the former one introduces a new aspect 
to reasoning, comparing inferences based on different theories, or epistemic 
states, respectively. This will be pursued in a more general framework later 
on, see Section 7.1. 

Note that no exceptional demands had to be made, and the ME-solution
arose in a rather natural way. Moreover, the proof of Proposition 5.5.1, which
states the uniqueness of the solution, illustrates how perfectly well the
approach presented here realizes ME-inference in an understandable manner,
without imposing any external and abstract minimality demand. Actually,
the proper idea of minimality is being made explicit by the four postulates.

Now that the ME-principles have been motivated and characterized in detail
it is time to ask how reasoning at optimum entropy actually works. We will 
throw light on this question from two different sides. 

The first part of this chapter (cf. also [KI98b]) investigates how inferring
at optimum entropy fits the formal framework for nonmonotonic reasoning 
of [Mak94]. In particular, we show that any inference operation based on 
ME-reasoning is cumulative and satisfies loop. Moreover, it turns out to be 
supraclassical with respect to standard probabilistic consequence which ob- 
viously generalizes classical consequence within a probabilistic setting (cf. 
Section 6.1). We also focus on the relationships between nonmonotonic and 
conditional ME-reasoning. Once more, it becomes obvious that material im- 
plication and conditionals differ substantially. To make the differences clear, 
we extend the conditional probabilistic language we are working in so as to 
contain probabilistic formulas corresponding to material implication, too. We 
show that conditionalization in its usual sense relates to material implication, 
whereas the connections between nonmonotonic reasoning and conditionals 
are more complex. 

Though the ME-methods genuinely manipulate knowledge of a numerical 
nature, ME-reasoning is not easily understood by observing the probabilities 
in change. ME-logic is not truth-functional, as fuzzy logic is, nor is its aim 
to raise or to lower probabilities, as in the framework of upper and lower 
probabilities, and there is no straightforward calculation algorithm, as for 
Bayesian networks. ME-inferring rather makes use of the intensional
structures of probabilistic knowledge (cf. [Par94, SJ80, KI98a]), so it seems to
be better classified and appreciated by describing its formal properties as a 
nonmonotonic inference operation. 

Nevertheless, some examples and practical inference schemes in simple but
typical cases are important to illustrate ME-inference beyond formal results;
they will be presented in the second part of this chapter (see also [KI97b]).
The representation of the ME-distribution central to the argumentation in
Chapter 5 (see equations (5.5), (5.6) and (5.7), page 76) then turns out to
be not only of theoretical but also of practical use, allowing us to calculate
ME-probability values explicitly. For instance, we will show how knowledge
is propagated transitively, and we will deal with cautious cut and cautious
monotonicity. These inference schemes, however, are global, not local, i.e. all
knowledge available has to be taken into account in their premises to give
correct results. But they provide useful insights into the practice of
ME-reasoning.



6.1 Probabilistic Consequence and Probabilistic 
Inference 

The inference operation $Cn$ of classical logic satisfies the following
characteristic conditions:

(i) inclusion or reflexivity: $X \subseteq Cn(X)$;

(ii) idempotence: $Cn(X) = Cn(Cn(X))$;

(iii) cut: $X \subseteq Y \subseteq Cn(X)$ implies $Cn(Y) \subseteq Cn(X)$;

(iv) monotonicity: $X \subseteq Y$ implies $Cn(X) \subseteq Cn(Y)$,

where $X, Y \subseteq \mathcal{L}$.

So Cn is a consequence operation, in the sense of Tarski. Its rigidity, ho- 
wever, appreciated in mathematics and establishing its fundamental meaning 
for logical deduction, makes it a poor candidate to represent commonsense 
reasoning - in particular, monotonicity prohibits defeasible conclusions (cf. 
[DP96]). Nevertheless, many nonstandard logics have elements of classical 
logic built in in one way or another, and Makinson [Mak94] and Kraus, Leh- 
mann, Magidor [KLM90] make use of it as a reference point to formulate 
their principles for nonmonotonic reasoning. 

Within a probabilistic framework, the classical consequence operation has 
to be modified appropriately to provide a suitable tool for the description of 
formal inferences. 

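Semantically, $Cn$ may be described by $Cn(X) = \{A \in \mathcal{L} \mid X \models A\}$, where
$\models$ demands taking account of every model of $X$. So we define the standard
probabilistic consequence operation
$Cn^{prob} : 2^{(\mathcal{L} \mid \mathcal{L})^{prob}} \to 2^{(\mathcal{L} \mid \mathcal{L})^{prob}}$ by virtue of

$Cn^{prob}(\mathcal{R}) = \{\phi \in (\mathcal{L} \mid \mathcal{L})^{prob} \mid P \models \phi \text{ for all } P \in Mod(\mathcal{R})\},$

with associated standard probabilistic consequence relation
$\models^{prob} \subseteq 2^{(\mathcal{L} \mid \mathcal{L})^{prob}} \times (\mathcal{L} \mid \mathcal{L})^{prob}$,

$\mathcal{R} \models^{prob} \phi$ iff $\phi \in Cn^{prob}(\mathcal{R})$.

If $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ is inconsistent, then $Mod(\mathcal{R}) = \emptyset$, hence
$Cn^{prob}(\mathcal{R}) = (\mathcal{L} \mid \mathcal{L})^{prob}$ for inconsistent $\mathcal{R}$. Paris and Vencovska [PV98, Par94]
present a sequent calculus for $\models^{prob}$. Because standard probabilistic consequence is
based on considering all models it is a consequence relation in the sense of
Tarski.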



Proposition 6.1.1. $Cn^{prob}$ satisfies inclusion, idempotence, cut and
monotonicity.

Thus $Cn^{prob}$ takes the part of classical logic within probabilistic logics. In
general, however, probabilistic models are notorious for their extreme
flexibility, so an inference operation based on all possible models is much too
conservative and will yield little more than quite trivial results. In analogy to
[Mak94], we define probabilistic inference operations and probabilistic
inference relations, respectively, to study nonclassical inferences:

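Definition 6.1.1. A probabilistic inference operation is an operation

$C : 2^{(\mathcal{L} \mid \mathcal{L})^{prob}} \to 2^{(\mathcal{L} \mid \mathcal{L})^{prob}}$

on sets of probabilistic conditionals. Its associated probabilistic inference
relation is given by $\mathrel{|\!\sim} \subseteq 2^{(\mathcal{L} \mid \mathcal{L})^{prob}} \times (\mathcal{L} \mid \mathcal{L})^{prob}$,
$\mathcal{R} \mathrel{|\!\sim} \phi$ iff $\phi \in C(\mathcal{R})$. $C$ is
called complete iff it specifies for each set $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ with $C(\mathcal{R}) \neq \emptyset$
and $C(\mathcal{R}) \neq (\mathcal{L} \mid \mathcal{L})^{prob}$ a unique distribution, i.e. iff there is a (unique)
distribution $Q_{\mathcal{R}}$ such that $C(\mathcal{R}) = Th(Q_{\mathcal{R}})$.

For fixed prior $P$ define the ME-inference operation
$C_P^{ME} : 2^{(\mathcal{L} \mid \mathcal{L})^{prob}} \to 2^{(\mathcal{L} \mid \mathcal{L})^{prob}}$ by

$C_P^{ME}(\mathcal{R}) = \begin{cases} Th(P *_{ME} \mathcal{R}) & \text{if } \mathcal{R} \text{ is } P\text{-consistent} \\ (\mathcal{L} \mid \mathcal{L})^{prob} & \text{else} \end{cases}$

where $*_{ME}$ is the ME-revision operator (cf. Section 2.5 and Chapter 5, in
particular Theorem 5.5.1). Obviously, $C_P^{ME}$ is a complete probabilistic inference
operation. The corresponding ME-inference relation is denoted by $\mathrel{|\!\sim}_P^{ME}$.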



6.2 Basic Properties of the ME-Inference Operation 

In this section, we list some properties of ME-inference which are particularly 
important in the framework of nonmonotonic reasoning and belief revision 
(see Section 2.1, page 13). Most of these properties are already well-known 
(cf. [PV90, SJ81]) or easily proved, respectively (see Appendix). 

One property that has proved to be crucial for characterizing ME- 
inference is that of logical coherence (cf. Section 5.3): 

$(P *_{ME} \mathcal{R}) *_{ME} (\mathcal{R} \cup \mathcal{S}) = P *_{ME} (\mathcal{R} \cup \mathcal{S})$    (6.2)

for any sets $\mathcal{R}, \mathcal{S} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ of probabilistic conditionals.

Furthermore, the uniqueness of the solution of type (5.5) yields an easy
corollary which meets fundamental demands for “good revisions”:

Proposition 6.2.1. $P *_{ME} \mathcal{R} = P$ if and only if $P \models \mathcal{R}$.

The next proposition shows that ME-inference satisfies important postulates 
for nonmonotonic inference operations: 

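Proposition 6.2.2. The ME-inference operation $C_P^{ME}$ satisfies

1. inclusion: $\mathcal{R} \subseteq C_P^{ME}(\mathcal{R})$.

2. idempotence: $C_P^{ME}(C_P^{ME}(\mathcal{R})) = C_P^{ME}(\mathcal{R})$.

3. cumulativity: $\mathcal{R} \subseteq \mathcal{S} \subseteq C_P^{ME}(\mathcal{R})$ implies $C_P^{ME}(\mathcal{R}) = C_P^{ME}(\mathcal{S})$.

As can be seen at once, cumulativity is equivalent to

If $P *_{ME} \mathcal{R} \models \mathcal{S}$ then $P *_{ME} (\mathcal{R} \cup \mathcal{S}) = P *_{ME} \mathcal{R}$,

which is stated as Principle 5 in [PV90].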

In the works of Makinson [Mak94] and Kraus, Lehmann and Magidor 
[KLM90], cumulativity takes a central position as a very fundamental pro- 
perty of inference processes: cumulativity ensures the inference process to be 
stable, in that adding of derivable knowledge does not alter the set of nonmo- 
notonic conclusions. As stated above (following [Mak94]), cumulativity is a 
pure condition, without any reference to logical connectives in the underlying 
language and thus being applicable to probabilistic logics in a straightforward 
way. Besides the numerous postulates for nonmonotonic inference relations in 
[Mak94] and [KLM90] that are based on classical structures, there is another 
interesting pure condition called loop (see (2.2) in Section 2.1, page 13): 

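Proposition 6.2.3. $C_P^{ME}$ satisfies the loop-property:

If $\mathcal{R}_1, \ldots, \mathcal{R}_m \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$ with

$\mathcal{R}_{i+1} \subseteq C_P^{ME}(\mathcal{R}_i)$ for $1 \leq i \leq m - 1$ and $\mathcal{R}_1 \subseteq C_P^{ME}(\mathcal{R}_m)$,

then $C_P^{ME}(\mathcal{R}_i) = C_P^{ME}(\mathcal{R}_j)$ for all $i, j = 1, \ldots, m$.

ME-inference $\mathrel{|\!\sim}_P^{ME}$ is not transitive, but loop guarantees unambiguity for a
sequence $\mathcal{R}_1 \mathrel{|\!\sim}_P^{ME} \mathcal{R}_2 \mathrel{|\!\sim}_P^{ME} \ldots \mathrel{|\!\sim}_P^{ME} \mathcal{R}_m \mathrel{|\!\sim}_P^{ME} \mathcal{R}_1$ of nonmonotonic
derivations.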

In the rest of this section, we will investigate interrelationships between
ME-inference and standard probabilistic consequence.

Proposition 6.2.4. $C_P^{ME}$ is supraclassical, that is, $Cn^{prob}(\mathcal{R}) \subseteq C_P^{ME}(\mathcal{R})$
for all $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$.

The proof of this proposition is straightforward because $P *_{ME} \mathcal{R}$ either is a
model of $\mathcal{R}$, or $C_P^{ME}(\mathcal{R}) = (\mathcal{L} \mid \mathcal{L})^{prob}$.

Supraclassicality implies further important properties: Because $C_P^{ME}$ is
cumulative and supraclassical, it also satisfies full absorption:

$Cn^{prob}\, C_P^{ME} = C_P^{ME} = C_P^{ME}\, Cn^{prob}$.

So in particular (cf. [Mak94]), $C_P^{ME}$ fulfills right weakening:

$\mathcal{R} \subseteq C_P^{ME}(\mathcal{S})$ implies $Cn^{prob}(\mathcal{R}) \subseteq C_P^{ME}(\mathcal{S})$

and left logical equivalence:

if $Cn^{prob}(\mathcal{R}) = Cn^{prob}(\mathcal{S})$ then $C_P^{ME}(\mathcal{R}) = C_P^{ME}(\mathcal{S})$.

Therefore ME-inference respects standard deductional structures. But
ME-inference fails distribution:

$C_P^{ME}(\mathcal{R}) \cap C_P^{ME}(\mathcal{S}) \subseteq C_P^{ME}(Cn^{prob}(\mathcal{R}) \cap Cn^{prob}(\mathcal{S}))$.

This can easily be seen as follows: Consider the case $P = P_0(a, b, c)$, where
$P_0(a, b, c)$ is the uniform distribution over three propositional variables $a, b$
and $c$. Let $\mathcal{R} = \{(b|a)[x_1], (c|a)[x_2]\}$ and $\mathcal{S} = \{(b|a)[x_2], (c|a)[x_1]\}$ with
$x_1 \neq x_2$. Then $(bc|a)[x_1 x_2] \in C_P^{ME}(\mathcal{R}) \cap C_P^{ME}(\mathcal{S})$ (cf. Proposition 6.4.5 in Section
6.4), but $(bc|a)[x_1 x_2] \notin C_P^{ME}(Cn^{prob}(\mathcal{R}) \cap Cn^{prob}(\mathcal{S}))$.



6.3 ME-Logic and Conditionals 

In Chapter 5, it is shown that ME-reasoning may be characterized as a
sound and consistent method for handling probabilistic conditionals. Thus the
notion of conditionals may be considered fundamental to ME-logic, and the
property of conditionalization

If $\{A_i\} \cup \{B\} \mathrel{|\!\sim} C$ then $\{A_i\} \mathrel{|\!\sim} B \to C$,    (6.3)

as stated in standard, non-probabilistic terms in [Mak94], is of special
interest. Here $\to$ is meant to represent material implication, which has no
straightforward counterpart in probabilistic logics. Namely, if $\phi, \psi$ are probabilistic
formulas (facts or conditionals), then $\phi \to \psi$ cannot be taken as, for instance,
$\neg\phi \vee \psi$ because neither negation nor disjunction is defined in $(\mathcal{L} \mid \mathcal{L})^{prob}$.
$\to$ is also clearly different from the operator $(\cdot \mid \cdot)$ used for probabilistic
conditionals, and not only for syntactical reasons. The satisfaction relation $\models$
between a distribution $P$ and a conditional $(B|A)[x]$ actually involves an
adjustment process, namely shifting $P$ to $P_A = P(\cdot \mid A)$, whereas material
implication should be satisfied within $P$.

In the sequel, we will investigate conditionalization with respect to a
suitably generalized material implication $A \to B[x]$ in probabilistics as well as
with respect to the proper probabilistic conditionals of the form $(B|A)[x]$.

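So for the moment, let us extend our language $(\mathcal{L} \mid \mathcal{L})^{prob}$ so as to also
contain all formulas of the type $A \to B[x]$:

$(\mathcal{L} \mid \mathcal{L})^{prob}_{\to} = (\mathcal{L} \mid \mathcal{L})^{prob} \cup \{A \to B[x] \mid A, B \in \mathcal{L},\ x \in [0,1]\}.$

We modify $C_P^{ME}$ slightly to take its values in $(\mathcal{L} \mid \mathcal{L})^{prob}_{\to}$: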



In correspondence to classical material implication, and in compatibility with
the use of the conditional operator $(\cdot \mid \cdot)$, we define the semantics of $A \to B[x]$
in a probabilistic setting as

$P \models A \to B[x]$ iff $P \models A[1]$ implies $P \models B[x]$

for a probability distribution $P$. Note that $P \models A \to B[x]$ is not equivalent
to $P \models (\neg A \vee B)[x]$, i.e. to $P(\neg A \vee B) = x$. Rather we have $\models A \to B[x]$ iff
$Mod(A[1]) \subseteq Mod(B[x])$. So our probabilistic interpretation of $\to$ generalizes
what Adams called strict consequence in [Ada66, p. 274]. In particular,
$\models A \to B[x]$ implies $P \models (B|A)[x]$ for all distributions $P$ with $P(A) \neq 0$.
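
The difference between the two satisfaction notions can be made concrete on a tiny example. The sketch below is only an illustration under stated assumptions: worlds are pairs of truth values for two variables $a, b$, and the helper names `sat_fact`, `sat_material` and `sat_conditional` are hypothetical, introduced here to mirror the definitions above.

```python
from itertools import product

# Worlds are pairs (a, b) of truth values; P is the uniform distribution.
P = {w: 0.25 for w in product([True, False], repeat=2)}
a = lambda w: w[0]
b = lambda w: w[1]

def prob(dist, event):
    return sum(p for w, p in dist.items() if event(w))

def sat_fact(dist, event, x, tol=1e-12):
    # dist |= A[x]  iff  dist(A) = x
    return abs(prob(dist, event) - x) <= tol

def sat_material(dist, ante, concl, x):
    # dist |= A -> B[x]  iff  (dist |= A[1] implies dist |= B[x])
    return (not sat_fact(dist, ante, 1.0)) or sat_fact(dist, concl, x)

def sat_conditional(dist, ante, concl, x, tol=1e-12):
    # dist |= (B|A)[x]  iff  dist(A) > 0 and dist(B|A) = x
    pa = prob(dist, ante)
    return pa > 0 and abs(prob(dist, lambda w: ante(w) and concl(w)) / pa - x) <= tol

# Conditioning on a shifts P to P_a = P(.|a).
P_a = {w: (p / prob(P, a) if a(w) else 0.0) for w, p in P.items()}

vacuous = sat_material(P, a, b, 0.3)        # holds vacuously since P(a) = 0.5 < 1
as_conditional = sat_conditional(P, a, b, 0.3)
after_shift = sat_material(P_a, a, b, 0.3)  # P_a(a) = 1, so b must have probability 0.3
```

Under the uniform distribution the material implication $a \to b[0.3]$ is satisfied vacuously, while the conditional $(b|a)[0.3]$ is not; only after shifting to $P_a$ do the two notions agree.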

Using this notion of (probabilistic) material implication, ME-inference
satisfies conditionalization in the sense of (6.3):

Proposition 6.3.1. Let $P$ be a distribution. Whenever
$B[x] \in C_P^{ME}(\mathcal{R} \cup \{A[1]\})$ then $A \to B[x] \in C_P^{ME}(\mathcal{R})$.

But note that Proposition 6.3.1 is false for probabilistic conditionals: in
general, $B[x] \in C_P^{ME}(\mathcal{R} \cup \{A[1]\})$ does not imply $(B|A)[x] \in C_P^{ME}(\mathcal{R})$. Instead,
we have the following connections between ME-inference and conditional
implication:

Proposition 6.3.2. Let $P$ be a distribution.

For any $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{prob}$, $(B|A)[x] \in (\mathcal{L} \mid \mathcal{L})^{prob}$, we have

1. $P *_{ME} \{A[1]\} = P_A = P(\cdot \mid A)$.

2. $P *_{ME} (\mathcal{R} \cup \{A[1]\}) = P_A *_{ME} \mathcal{R}$, i.e. $C_P^{ME}(\mathcal{R} \cup \{A[1]\}) = C_{P_A}^{ME}(\mathcal{R})$.

3. $(P *_{ME} \mathcal{R}) *_{ME} \{A[1]\} \models B[x]$ implies $P *_{ME} \mathcal{R} \models (B|A)[x]$.

Proposition 6.3.2 shows that $P *_{ME} (\mathcal{R} \cup \{A[1]\})$ and $(P *_{ME} \mathcal{R}) *_{ME} \{A[1]\}$
differ in general, and that only the latter of these distributions reveals a
behavior that is compatible with conditional implication.

Proposition 6.3.2(2) links up two inference operations based on different
probability distributions in a special case. Such relationships are of crucial 
meaning for studying iterated belief revision and for investigating belief revi- 
sion and nonmonotonic reasoning in a more general framework (cf. Chapters 
2 and 7). 



6.4 ME-Deduction Rules 

We will now leave the abstract level of argumentation and turn to concrete
inference patterns. Once more, it must be emphasized, however, that
ME-inferring is a global, not a local method: Only if all available knowledge is
taken into account can the results of ME-inference be relied on to yield best
expectation values. Thus it is not possible to use only partial information
for reasoning, and then continue the process of adjusting from the obtained
intermediate distribution with the information still left. It is important that
in the two-step adjustment process $(P * \mathcal{R}_1) * (\mathcal{R}_1 \cup \mathcal{R}_2)$ dealt with in the
coherence postulate (P3) (see Section 5.3) the second adaptation step uses
the full information $\mathcal{R}_1 \cup \mathcal{R}_2$. In fact, the distributions $(P *_{ME} \mathcal{R}_1) *_{ME} (\mathcal{R}_1 \cup \mathcal{R}_2)$
and $(P *_{ME} \mathcal{R}_1) *_{ME} \mathcal{R}_2$ differ in general.
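
The global character of ME-inference can be observed numerically. The following sketch is not the method of SPIRIT or PIT; it is a minimal cyclic scaling scheme (repeated cross-entropy projection onto one conditional at a time), assumed here to approximate $P *_{ME} \mathcal{R}$ for rules with probabilities strictly between 0 and 1 and a positive prior. The names `me_revise` and `cond`, the sweep count, and the choice of $\mathcal{R}_1 = \{(b|a)[0.9]\}$, $\mathcal{R}_2 = \{(c|b)[0.85]\}$ are illustrative assumptions.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=3))  # truth values of (a, b, c)
a = lambda w: w[0]
b = lambda w: w[1]
c = lambda w: w[2]

def me_revise(prior, rules, sweeps=2000):
    """Approximate P * R; rules are triples (conclusion, antecedent, x) with 0 < x < 1."""
    p = dict(prior)
    for _ in range(sweeps):
        for concl, ante, x in rules:
            ver = sum(q for w, q in p.items() if ante(w) and concl(w))
            fal = sum(q for w, q in p.items() if ante(w) and not concl(w))
            # alpha is chosen so that P(concl | ante) = x holds after scaling
            alpha = x * fal / ((1 - x) * ver)
            for w in p:
                if ante(w):
                    p[w] *= alpha ** (1 - x) if concl(w) else alpha ** (-x)
            z = sum(p.values())
            p = {w: q / z for w, q in p.items()}
    return p

def cond(p, concl, ante):
    num = sum(q for w, q in p.items() if ante(w) and concl(w))
    return num / sum(q for w, q in p.items() if ante(w))

uniform = {w: 1 / 8 for w in WORLDS}
R1 = [(b, a, 0.9)]
R2 = [(c, b, 0.85)]

direct = me_revise(uniform, R1 + R2)     # P * (R1 u R2)
step1 = me_revise(uniform, R1)           # P * R1
coherent = me_revise(step1, R1 + R2)     # (P * R1) * (R1 u R2)
local = me_revise(step1, R2)             # (P * R1) * R2

max_gap = max(abs(direct[w] - coherent[w]) for w in WORLDS)
```

In this example `coherent` agrees with `direct` up to numerical error, as (6.2) demands, whereas `local` no longer satisfies $(b|a)[0.9]$: revising by $\mathcal{R}_2$ alone disturbs the knowledge learned in the first step.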

For this reason, the deduction rules to be presented in the sequel do not
provide a convenient (and complete) calculus for ME-reasoning. But they
effectively illustrate the reasonableness of that technique by explicitly
calculating inferred probabilities of rules in terms of given (or learned, respectively)
probabilities. In contrast to this, the inference patterns for deriving lower
and upper bounds for probabilities presented in [DPT90] and [TGK92] are
local, but they are afflicted with all problems typical of methods for inferring
intervals, not single values (cf. Chapter 5).

It must be pointed out that in principle, ME-reasoning is feasible for many 
consistent probabilistic representation and adaptation problems by iterative 
propagation. This is realized, for instance, by the probabilistic expert sy- 
stem shells SPIRIT [RM96] and PIT [SF97] far beyond the scope of the few 
inference patterns given below (also see Example 6.4.3; cf. Chapter 9). 

We will use the following notation:

$\dfrac{\mathcal{R} : (B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]}{(B_1^*|A_1^*)[x_1^*], \ldots, (B_m^*|A_m^*)[x_m^*]}$

iff $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$ and
$P_0 *_{ME} \mathcal{R} \models \{(B_1^*|A_1^*)[x_1^*], \ldots, (B_m^*|A_m^*)[x_m^*]\}$,
where $P_0$ is a uniform distribution of suitable size.



6.4.1 Chaining Rules 

Proposition 6.4.1 (Transitive Chaining). Suppose $a, b, c$ to be
propositional variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (b|a)[x_1],\ (c|b)[x_2]}{(c|a)[\frac{1}{2}(2 x_1 x_2 + 1 - x_1)]}$    (6.4)

Example 6.4.1. Suppose the propositional variables $a, b, c$ are given the
meanings a = Being young, b = Being single, and c = Having children, respectively.
We know (or assume) that young people are usually singles (with probability
0.9) and that mostly, singles do not have children (with probability 0.85),
so that $\mathcal{R} = \{(b|a)[0.9], (\overline{c}|b)[0.85]\}$. Using (6.4) with $x_1 = 0.9$ and $x_2 = 0.85$,
ME-reasoning yields $(\overline{c}|a)[0.815]$ (the negation of $c$ makes no difference).
Therefore from the knowledge stated by $\mathcal{R}$ we may conclude that the probability
of an individual not having children if (s)he is young is best estimated by
0.815.
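
The value in Example 6.4.1 can be recomputed directly from (6.4). The sketch below also evaluates an equivalent reading of the formula, offered here as an interpretation rather than as part of the original proof: with probability $x_1$ the antecedent leads to $b$, where the conclusion has probability $x_2$, and in the remaining cases nothing is known, so the conclusion keeps the uniform value 1/2. The function names are illustrative.

```python
def chain(x1, x2):
    # Right-hand side of (6.4): probability inferred for (c|a)
    return 0.5 * (2 * x1 * x2 + 1 - x1)

def chain_as_mixture(x1, x2):
    # Same value written as a mixture over b and not-b (an interpretation, see lead-in)
    return x1 * x2 + (1 - x1) * 0.5

young_no_children = chain(0.9, 0.85)   # Example 6.4.1
```

For $x_1 = 1$ the rule reduces to $x_2$, i.e. to strict transitivity, and for $x_1 = 0$ it yields 1/2, the value of complete ignorance under a uniform prior.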



In many cases, however, rules must not be simply connected transitively
as in Proposition 6.4.1 because definite exceptions are present. Let us consider
the famous “Tweety the penguin” example.

Example 6.4.2. Most birds fly, i.e. $(fly|bird)[x_1]$ with a probability $x_1$
between 0.5 and 1, penguins are definitely birds, $(bird|penguin)[1]$, but no one
has ever seen a flying penguin, so $(fly|penguin)[x_2]$ with a probability $x_2$
very close to 0. What may be inferred about Tweety who is known to be a
bird and a penguin?

The crucial point in this example is that two pieces of evidence apply to
Tweety, one being more specific than the other. The next proposition shows
that ME-reasoning is able to cope with categorical specificity.



Proposition 6.4.2 (Categorical Specificity). Suppose $a, b, c$ to be
propositional variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (b|a)[x_1],\ (b|c)[x_2],\ (a|c)[1]}{(b|ac)[x_2]}$    (6.5)



Actually, (6.5) is a general probabilistic deduction scheme, not only due
to ME-reasoning: Specific information dominates more general information,
if the specificity relation $(a|c)[1]$ is categorical. This is an easy probabilistic
calculation. Proposition 6.4.2, however, is proved here by using the $\alpha$-factors
of the corresponding ME-distribution to illustrate the interdependencies
between the three conditionals. If the probability of the specificity conditional
$(a|c)$ lies somewhere in between 0 and 1, the equational system determining
the $\alpha_i$'s becomes more complicated. But it can be solved by iteration, e.g.
with the aid of SPIRIT (cf. [RM96]), if the conditional probabilities involved
are numerically specified (cf. Example 6.4.3 below). Within a qualitative
probabilistic context, Adams' $\varepsilon$-semantics [Ada75] presents a method to handle
exceptions and to take account of subclass specificity. Goldszmidt, Morris
and Pearl [GMP90] showed how reasoning based on infinitesimal
probabilities may be improved by using ME-principles.

Example 6.4.3. A knowledge base is to be built up representing “Typically,
students are adults”, “Usually, adults are employed” and “Mostly, students
are not employed” with probabilistic degrees of uncertainty 0.99 (< 1), 0.8
and 0.9, respectively. Let $a, s, e$ denote the propositional variables a = Being
an Adult, s = Being a Student, and e = Being Employed. The quantified
conditional information may be written as $\mathcal{R} = \{(a|s)[0.99], (e|a)[0.8], (\overline{e}|s)[0.9]\}$.
From this, SPIRIT (cf. [RM96]) calculates $P^*(\overline{e}|as) = 0.8991 \approx 0.9$. So the
more specific information $s$ dominates $a$ clearly, but not completely.
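
The figure reported for Example 6.4.3 can be approximated without access to SPIRIT. The sketch below uses a simple cyclic scaling scheme (repeated cross-entropy projection onto one conditional at a time) starting from the uniform distribution. This is an assumption-laden stand-in, not the algorithm implemented in SPIRIT; the names `me_revise` and `cond` and the number of sweeps are illustrative choices.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=3))  # truth values of (a, s, e)
adult = lambda w: w[0]
student = lambda w: w[1]
employed = lambda w: w[2]
not_employed = lambda w: not w[2]
adult_student = lambda w: w[0] and w[1]

def me_revise(prior, rules, sweeps=5000):
    """Approximate P * R; rules are triples (conclusion, antecedent, x) with 0 < x < 1."""
    p = dict(prior)
    for _ in range(sweeps):
        for concl, ante, x in rules:
            ver = sum(q for w, q in p.items() if ante(w) and concl(w))
            fal = sum(q for w, q in p.items() if ante(w) and not concl(w))
            # alpha is chosen so that P(concl | ante) = x holds after scaling
            alpha = x * fal / ((1 - x) * ver)
            for w in p:
                if ante(w):
                    p[w] *= alpha ** (1 - x) if concl(w) else alpha ** (-x)
            z = sum(p.values())
            p = {w: q / z for w, q in p.items()}
    return p

def cond(p, concl, ante):
    num = sum(q for w, q in p.items() if ante(w) and concl(w))
    return num / sum(q for w, q in p.items() if ante(w))

uniform = {w: 1 / 8 for w in WORLDS}
rules = [(adult, student, 0.99), (employed, adult, 0.8), (not_employed, student, 0.9)]
posterior = me_revise(uniform, rules)
p_not_employed = cond(posterior, not_employed, adult_student)
```

The quantity `p_not_employed` corresponds to $P^*(\overline{e}|as)$ and should come out close to the value 0.8991 quoted in the example, slightly below the 0.9 attached to students alone.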

Thus ME-inference solves in an elegant way the problem of conflicting 
evidence. Specific information dominates more general knowledge by virtue 
of the inherent mechanisms, without any external preferential or hierarchical 
structures as in [KLM90, Bre89], and without rankings as in [Gef92, GP92]. 
The weight of a rule is encoded by its conditional-logical structure and its 
probability, its interactions with other rules being given implicitly. It is only 
the application of the ME-principle which combines the probabilistic rules to 
yield inferences, thus allowing a convenient modularity of knowledge repre- 
sentation. 

6.4.2 Cautious Monotonicity and Cautious Cut 

Obviously, ME-inference acts nonmonotonically: conjoining the antecedent 
of a conditional with a further literal may alter the probability of the condi- 
tional dramatically (cf. Example 6.4.3). But a weak form of monotonicity is 
reasonable and can indeed be proved: 

Proposition 6.4.3 (Cautious Monotonicity). Suppose $a, b, c$ to be
propositional variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (b|a)[x_1],\ (c|a)[x_2]}{(c|ab)[x_2]}$    (6.6)

(6.6) illustrates how ME-propagation respects conditional independence
(cf. [SJ81]): $P^*(c|ab) = P^*(c|a) = x_2$ if $(b|a)[x_1], (c|a)[x_2]$ is the only available
knowledge.

The cautious monotonicity inference rule deals with adding information 
to the antecedent. Another important case arises if literals in the antecedent 
have to be deleted. Of course we cannot expect the classical cut rule to hold. 
But, as in the case of monotonicity, a cautious cut rule may be proved: 



Proposition 6.4.4 (Cautious Cut). Suppose $a, b, c$ to be propositional
variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (c|ab)[x_1],\ (b|a)[x_2]}{(c|a)[\frac{1}{2}(2 x_1 x_2 + 1 - x_2)]}$    (6.7)

(6.7) is cautious in that the probability of $(c|a)$ is a (simple) function of the
probabilities assigned to $(c|ab)$ and $(b|a)$. By observing the equivalence
$(b|a) \equiv (ab|a)$, (6.7) may be taken as an immediate consequence of the transitive
chaining rule (6.4).



6.4.3 Conjoining Literals in Antecedent and Consequent 

The following deduction schemes deal with various cases of inferring
probabilistic conditionals with literals in antecedents and consequents being
conjoined. Three of them (Conjunction Left and Conjunction Right, (ii) and (iii))
are treated in [TGK92] under similar names, thus allowing a direct
comparison of ME-inference to probabilistic local bounds propagation. Cautious
monotonicity (6.6) may be found in that paper, too, where it is denoted as
Weak Conjunction Left. We will omit the straightforward proofs.

Proposition 6.4.5 (Conjunction Right). Suppose $a, b, c$ to be
propositional variables, $x_1, x_2 \in [0,1]$. Then the following ME-inference rules hold:

(i) $\dfrac{\mathcal{R} : (b|a)[x_1],\ (c|a)[x_2]}{(bc|a)[x_1 x_2]}$

(ii) $\dfrac{\mathcal{R} : (b|a)[x_1],\ (c|ab)[x_2]}{(bc|a)[x_1 x_2]}$

(iii) $\dfrac{\mathcal{R} : (b|a)[x_1],\ (c|b)[x_2]}{(bc|a)[x_1 x_2]}$

Proposition 6.4.6 (Conjunction Left). Suppose $a, b, c$ to be propositional
variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (b|a)[x_1],\ (bc|a)[x_2]}{(c|ab)[\frac{x_2}{x_1}]}$



6.4.4 Reasoning by Cases 



The last inference scheme presented in this section will show how probabilistic 
information obtained by considering exclusive cases is being processed at 
maximum entropy: 



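Proposition 6.4.7 (Reasoning by cases). Suppose $a, b, c$ to be
propositional variables, $x_1, x_2 \in [0,1]$. Then

$\dfrac{\mathcal{R} : (c|ab)[x_1],\ (c|a\overline{b})[x_2]}{(b|a)\left[\left(1 + \dfrac{x_1^{x_1}(1 - x_1)^{1 - x_1}}{x_2^{x_2}(1 - x_2)^{1 - x_2}}\right)^{-1}\right],\quad (c|a)\left[x_1\left(1 + \dfrac{x_1^{x_1}(1 - x_1)^{1 - x_1}}{x_2^{x_2}(1 - x_2)^{1 - x_2}}\right)^{-1} + x_2\left(1 + \dfrac{x_2^{x_2}(1 - x_2)^{1 - x_2}}{x_1^{x_1}(1 - x_1)^{1 - x_1}}\right)^{-1}\right]}$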
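
The closed form for $(b|a)$ in the reasoning-by-cases rule can be cross-checked by a direct entropy argument. The sketch below assumes a uniform prior, so that ME-adjustment amounts to maximizing entropy; under that assumption the value $p = P^*(b|a)$ maximizes $h(p) + p\,h(x_1) + (1 - p)\,h(x_2)$, where $h$ is the binary entropy. This decomposition and all function names are illustrative and not taken from the text; the grid search is a crude stand-in for an exact optimization.

```python
import math

def h(x):
    # Binary entropy (in nats) of a probability x in (0, 1)
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def cases_closed_form(x1, x2):
    # Closed-form values for (b|a) and (c|a) as stated in the rule
    w1 = x1 ** x1 * (1 - x1) ** (1 - x1)
    w2 = x2 ** x2 * (1 - x2) ** (1 - x2)
    p_b = 1 / (1 + w1 / w2)
    p_c = x1 * p_b + x2 * (1 - p_b)
    return p_b, p_c

def cases_by_search(x1, x2, steps=100_000):
    # Grid search for the p = P(b|a) that maximizes the entropy given a
    grid = (i / steps for i in range(1, steps))
    return max(grid, key=lambda p: h(p) + p * h(x1) + (1 - p) * h(x2))

p_b_closed, p_c_closed = cases_closed_form(0.9, 0.3)
p_b_search = cases_by_search(0.9, 0.3)
```

Note that the case with the less extreme probability (here $x_2 = 0.3$) receives the larger share of $a$: maximum entropy shifts mass towards the case about which the conditional says less.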



To date, no deduction scheme is known for the interesting antecedent
conjunction problem

$\dfrac{\mathcal{R} : (c|a)[x_1],\ (c|b)[x_2]}{(c|ab)[??]}$

(cf. [Nut80, Spi91]). Such a scheme would reveal clearly how ME-inference
combines evidences. Note that the consistent handling of the antecedent
conjunction problem plays a crucial role for characterizing ME-inference (see
Theorem 5.3.1, page 86, and the remarks following it).




7. Belief Revision and Nonmonotonic 
Reasoning — Revisited 



Nonmonotonic reasoning and belief revision are closely related in that they 
both deal with reasoning under uncertainty and try to reveal sensible lines of 
reasoning in response to incoming information (cf. Chapter 2, in particular, 
Section 2.3). As we already pointed out, the crucial difference between both 
areas is the role of the knowledge base which is only implicit in nonmonotonic 
reasoning, but explicit and in fact in the focus of interest in belief revision. 
So the correspondences between axioms of belief change and properties of 
nonmonotonic inference operations are usually elaborated only in the case 
that revisions are based on a fixed theory (cf. [MG91]), and very little work 
has been done to incorporate iterated belief revision in that framework. 

Within the context of ME-inference, iterated adjustments arise quite
naturally and are dealt with in a satisfactory manner (cf. Equation (6.2) and
Proposition 6.3.2). The crucial point here is that the ME-operator $*_{ME}$
actually is a full revision operator taking two entries, namely a
distribution P on its left and a (compatible and consistent) set of
conditionals on its right. Nonmonotonic reasoning and belief revision usually
focus on handling its right entry, while considering its left entry - i.e.
the theory inferences are based on - to be given.

In this chapter, we will exploit the relationships between nonmonotonic 
reasoning and belief revision further by considering epistemic states and sets 
of conditionals instead of theories and propositional beliefs. We will provide a 
more general framework that not only allows a more accurate representation 
of belief revision via nonmonotonic formalisms, but also gives, vice versa, 
an important impetus to handle iterated revisions. So we will generalize the 
notion of an inference operation and introduce universal inference operations 
as a suitable counterpart to (full) revision operators in nonmonotonic logics. 

In particular, we will show that the property of logical coherence which
was identified as one of the axioms for characterizing ME-inference (cf.
Section 5.3) may be considered as a strong version of cumulativity for
universal inference operations.

Leaving the classical framework also allows a more accurate view on
iterated revision by differentiating between simultaneous and successive
revision.
The former will be seen to handle genuine revisions appropriately, while the 
latter may also model updating. This distinction is based on clearly separating 
background or generic knowledge from evidential or contextual knowledge, a 
feature that is listed in [DP96] as one of three basic requirements a plausi- 
ble exception-tolerant inference system has to meet. Moreover, we will show 
that in a probabilistic framework, it is also possible to treat revision as dif- 
ferent from focusing without giving up the assumption of having a single, 
distinguished probability distribution as base for inferences. 

Parts of the results presented in this chapter were already published in
[KI01] and [KI98b].



7.1 Universal Inference Operations 

In Section 6.1, we called a probabilistic inference operation C complete if
it specifies for each set $\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^{prob}$
with $C(\mathcal{R}) \neq \emptyset$, $C(\mathcal{R}) \neq (\mathcal{L}|\mathcal{L})^{prob}$,
a unique distribution $Q_{\mathcal{R}}$ such that $C(\mathcal{R}) = Th(Q_{\mathcal{R}})$
(cf. Definition 6.1.1, page 93). That is to say that to each set of
probabilistic conditionals yielding non-trivial inferences a suitable model
is associated by which the corresponding inferences can be described. Hence
we assumed the probabilistic inference operation C to be model-based (cf.
[Her91, Thi89]).

This definition also makes sense in a more general framework dealing
with epistemic states if we choose a suitable language $(\mathcal{L}|\mathcal{L})^*$.
For instance, for probability distributions and ordinal conditional
functions, we take $(\mathcal{L}|\mathcal{L})^* = (\mathcal{L}|\mathcal{L})^{prob}$
and $(\mathcal{L}|\mathcal{L})^{OCF}$, respectively, and in a purely qualitative
setting, we assume $(\mathcal{L}|\mathcal{L})^* = (\mathcal{L}|\mathcal{L})$. In any
case, $\mathcal{L}$ is a propositional language over an alphabet $\mathcal{V}$.
Let $\mathcal{E}^* = \mathcal{E}^*_{\mathcal{V}}$ denote the set of epistemic states
using $(\mathcal{L}|\mathcal{L})^*$ for representation of (conditional) beliefs
(see Section 3.1). For an epistemic state $\Psi \in \mathcal{E}^*$, we have

$Th^*(\Psi) = \{\phi \in (\mathcal{L}|\mathcal{L})^* \mid \Psi \models \phi\}$

which is assumed to describe $\Psi$ uniquely, up to representation
equivalence (cf. the uniqueness assumption (3.4), p. 31). So epistemic states
are considered as models of sets of conditionals
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$:

$Mod^*(\mathcal{R}) = \{\Psi \in \mathcal{E}^* \mid \Psi \models \mathcal{R}\}$

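This allows us to extend semantical entailment to sets of conditionals by
setting

$\mathcal{R}_1 \models^* \mathcal{R}_2$ iff $Mod^*(\mathcal{R}_1) \subseteq Mod^*(\mathcal{R}_2)$,

and to define a (monotonic) consequence operation
$Cn^* : 2^{(\mathcal{L}|\mathcal{L})^*} \to 2^{(\mathcal{L}|\mathcal{L})^*}$ by

$Cn^*(\mathcal{R}) = \{\phi \in (\mathcal{L}|\mathcal{L})^* \mid \mathcal{R} \models^* \phi\}$

in analogy to classical consequence. Two sets of conditionals
$\mathcal{R}_1, \mathcal{R}_2 \subseteq (\mathcal{L}|\mathcal{L})^*$ are called
(epistemically) equivalent iff $Mod^*(\mathcal{R}_1) = Mod^*(\mathcal{R}_2)$.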

In the sequel, we will consider (conditional) inference operations

$C : 2^{(\mathcal{L}|\mathcal{L})^*} \to 2^{(\mathcal{L}|\mathcal{L})^*}$    (7.1)

associating with each set of conditionals a set of inferred conditionals.
This generalizes the notion of inference operations given in Section 2.1
since propositional facts may be considered as degenerate conditionals.

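Definition 7.1.1. A conditional inference operation C is called complete iff
it specifies for each set $\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$ with
$C(\mathcal{R}) \neq \emptyset$, $C(\mathcal{R}) \neq (\mathcal{L}|\mathcal{L})^*$, a
complete epistemic state $\Psi_{\mathcal{R}}$, i.e. iff there is an epistemic
state $\Psi_{\mathcal{R}}$ such that

$C(\mathcal{R}) = Th^*(\Psi_{\mathcal{R}})$.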

Definition 7.1.2. A universal inference operation C assigns a complete
conditional inference operation

$C_\Psi : 2^{(\mathcal{L}|\mathcal{L})^*} \to 2^{(\mathcal{L}|\mathcal{L})^*}$

to each epistemic state $\Psi \in \mathcal{E}^*$:

$C : \Psi \mapsto C_\Psi$.

C is said to be reflexive (idempotent, cumulative) iff all its involved
inference operations have the corresponding property.

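If $C : \Psi \mapsto C_\Psi$ is a universal inference operation, $C_\Psi$ is
complete for each $\Psi \in \mathcal{E}^*$. That means, for each set
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$, $C_\Psi(\mathcal{R})$ is either
$\emptyset$ or $(\mathcal{L}|\mathcal{L})^*$, or it specifies completely an
epistemic state $\Phi_{\Psi,\mathcal{R}}$:

$C_\Psi(\mathcal{R}) = Th^*(\Phi_{\Psi,\mathcal{R}})$    (7.2)

Define the set of all such epistemic states by

$\mathcal{E}^*(C_\Psi) = \{\Phi \in \mathcal{E}^* \mid \exists\, \mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^* : C_\Psi(\mathcal{R}) = Th^*(\Phi)\}$.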



Definition 7.1.3. A universal inference operation C preserves consistency
iff for each epistemic state $\Psi \in \mathcal{E}^*$ and for each consistent
set $\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$, $C_\Psi(\mathcal{R}) \neq \emptyset$
and $C_\Psi(\mathcal{R}) \neq (\mathcal{L}|\mathcal{L})^*$. In a quantitative setting,
when $\Psi$ is represented by a conditional valuation function V, we further
presuppose $\mathcal{R}$ to be V-consistent.

Here consistency of a set $\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$ means
that there is an epistemic state representing $\mathcal{R}$. When the (prior)
epistemic state is represented by a conditional valuation function V,
$\mathcal{R}$ is said to be V-consistent iff there is a conditional valuation
function V' which is V-consistent and represents $\mathcal{R}$ (see
Definitions 4.5.1 and 5.1.1).

Definition 7.1.4. A universal inference operation C is founded iff for each
epistemic state $\Psi$ and for any $\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$,
$\Psi \models \mathcal{R}$ implies $C_\Psi(\mathcal{R}) = Th^*(\Psi)$.

The property of foundedness establishes a close and intuitive relationship
between an epistemic state $\Psi$ and its associated inference operation
$C_\Psi$, distinguishing $\Psi$ as its stable starting point. In particular,
if C is founded then $C_\Psi(\emptyset) = Th^*(\Psi)$ (this property is called
faithfulness in [KI01]).

As to the universal inference operation C, foundedness ensures injectivity, 
as can easily be proved: 

Proposition 7.1.1. If C is founded, then it is injective. 

In standard (i.e. one-dimensional) nonmonotonic reasoning, as it was
developed in [Mak94] and [KLM90], cumulativity occupies a central and
fundamental position, claiming the inferences of a set $\mathcal{S}$ that
“lies in between” another set $\mathcal{R}$ and its nonmonotonic consequences
$C(\mathcal{R})$ to coincide with $C(\mathcal{R})$ (cf. equation (2.1), p. 12).

To establish a similar well-behavedness of C with respect to epistemic 
states, we introduce suitable relations to compare epistemic states with one 
another. 

Definition 7.1.5. Let $C : \Psi \mapsto C_\Psi$ be a universal inference
operation. For each epistemic state $\Psi$, define a relation
$\sqsubseteq_\Psi$ on $\mathcal{E}^*(C_\Psi)$ by setting

$\Phi_1 \sqsubseteq_\Psi \Phi_2$

iff there are sets $\mathcal{R}_1 \subseteq \mathcal{R}_2 \subseteq (\mathcal{L}|\mathcal{L})^*$
such that

$Th^*(\Phi_1) = C_\Psi(\mathcal{R}_1)$ and $Th^*(\Phi_2) = C_\Psi(\mathcal{R}_2)$.

For founded universal inference operations, we have in particular
$C_\Psi(\emptyset) = Th^*(\Psi)$ for all $\Psi \in \mathcal{E}^*$, so $\Psi$ is a
minimal element of $\mathcal{E}^*(C_\Psi)$ with respect to $\sqsubseteq_\Psi$:

Proposition 7.1.2. If C is founded, then for all $\Psi \in \mathcal{E}^*$ and
for all $\Phi \in \mathcal{E}^*(C_\Psi)$, it holds that $\Psi \sqsubseteq_\Psi \Phi$.

We will now generalize the notion of cumulativity for universal inference
operations:

Definition 7.1.6. A universal inference operation C is called strongly
cumulative iff for each $\Psi \in \mathcal{E}^*$ and for any epistemic states
$\Phi_1, \Phi_2 \in \mathcal{E}^*(C_\Psi)$,

$\Phi_1 \sqsubseteq_\Psi \Phi_2$ implies: whenever
$\mathcal{R}_1 \subseteq \mathcal{R}_2 \subseteq (\mathcal{L}|\mathcal{L})^*$ such that
$Th^*(\Phi_1) = C_\Psi(\mathcal{R}_1)$ and $Th^*(\Phi_2) = C_\Psi(\mathcal{R}_2)$, then
$Th^*(\Phi_2) = C_\Psi(\mathcal{R}_2) = C_{\Phi_1}(\mathcal{R}_2)$.

Strong cumulativity describes a relationship between inference operations
based on different epistemic states, thus linking up the inference operations
of C. In the definition above, $\Phi_1$ is an epistemic state intermediate
between $\Psi$ and $\Phi_2$ with respect to the relation $\sqsubseteq_\Psi$, and
strong cumulativity claims that the inferences based on $\Phi_1$ coincide
with the inferences based on $\Psi$ within the scope of $\Phi_2$.

The next proposition is immediate: 

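Proposition 7.1.3. Let C be a universal inference operation which is
strongly cumulative. Suppose $\Psi \in \mathcal{E}^*$, $\Phi \in \mathcal{E}^*(C_\Psi)$
such that $Th^*(\Phi) = C_\Psi(\mathcal{R})$,
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$. Then

$C_\Psi(\mathcal{S}) = C_\Phi(\mathcal{S})$

for any $\mathcal{S} \subseteq (\mathcal{L}|\mathcal{L})^*$ with
$\mathcal{R} \subseteq \mathcal{S}$.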

The following theorem justifies the name “strong cumulativity”: It states 
that strong cumulativity actually generalizes cumulativity for an important 
class of universal inference operations: 

Theorem 7.1.1. If C is founded, then strong cumulativity implies cumula- 
tivity. 

Universal inference operations $C : \Psi \mapsto C_\Psi$ are a proper
counterpart of revision operators $*$ in nonmonotonic reasoning by virtue of
setting

$\Psi * \mathcal{R} = \Phi_{\Psi,\mathcal{R}}$

(cf. (7.2) above) for $\Psi \in \mathcal{E}^*$,
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$, that is

$C_\Psi(\mathcal{R}) = Th^*(\Psi * \mathcal{R})$    (7.3)

Using this notation and the uniqueness assumption (3.4), foundedness means

$\Psi * \mathcal{R} = \Psi$ if $\Psi \models \mathcal{R}$

(cf. the stability postulate (CR2) in Section 4.1). Strong cumulativity is
equivalent to

$\Psi * (\mathcal{R} \cup \mathcal{S}) = (\Psi * \mathcal{R}) * (\mathcal{R} \cup \mathcal{S})$,    (7.4)

that is, to what was called logical coherence in the framework of
probabilistic inference (see equation (5.18) in Section 5.3). Equation (6.2)
and Proposition 6.2.1 in Section 6.2 show that ME-inference defines a founded
and strongly cumulative universal operation $C^{ME} : P \mapsto C^{ME}_P$. Both
properties, foundedness and strong cumulativity, may serve to control
iterated revisions.

It is worth noticing that strong cumulativity, introduced here as a
generalization of cumulativity for binary probabilistic revision operators
and satisfied by ME-inference, may be considered as a set-theoretical version
of postulate (C1) of the DP-postulates (cf. p. 22) for iterated belief change
(cf. [DP97a]).



7.2 Simultaneous and Successive Revision 

Investigating revision in the generalized framework of epistemic states and
(quantified) conditionals allows a deeper insight into the mechanisms
underlying the revision process. As a crucial difference to propositional
belief revision, it is possible to distinguish between revising
simultaneously and successively. In general, we have

$\Psi * (\mathcal{R} \cup \mathcal{S}) \neq (\Psi * \mathcal{R}) * \mathcal{S}$    (7.5)

(this is well-known for ME-inference); instead, we may only postulate strong
cumulativity or logical coherence, respectively,

$\Psi * (\mathcal{R} \cup \mathcal{S}) = (\Psi * \mathcal{R}) * (\mathcal{R} \cup \mathcal{S})$,

which is essentially weaker. The failure revealed in (7.5) is responsible e.g.
for the unpleasant complexity of ME-reasoning (cf. [Par94]). No cutting down
to local propagation rules is possible here in general, but, on the other
hand, we observe a greater variety of revision types. In fact, (7.5) allows
us to incorporate knowledge on different levels:

Suppose the (already revised) epistemic state $\Psi * \mathcal{R}$ reflects our
knowledge, and we learn the conditionals in $\mathcal{S}$ to hold. Three
(generally) different ways to revise $\Psi * \mathcal{R}$ by $\mathcal{S}$ are
imaginable:

- If we decide that $\mathcal{R}$ and $\mathcal{S}$ represent knowledge on the
  same level, then we should accept $\Psi * (\mathcal{R} \cup \mathcal{S})$ as
  revised epistemic state.

- Maybe we regard $\mathcal{S}$ as successive to $\mathcal{R}$; then
  $(\Psi * \mathcal{R}) * \mathcal{S}$ is supposed to reflect the new state of
  belief.

- A third type of revision arises if one considers $\mathcal{S}$ as belonging
  to $\Psi$, perhaps representing additional background or generic knowledge.
  Then a suitable revision can be performed by calculating
  $(\Psi * \mathcal{S}) * \mathcal{R}$.

The first of these revision types realizes genuine revision:
$\Psi * \mathcal{R}$ is revised by learning additional evidential knowledge.
The second type is more in the sense of updating: We do not expect explicitly
the knowledge in $\mathcal{R}$ to hold any longer, rather we concede that some
change in the world may have occurred so that $\mathcal{S}$ now overrides
$\mathcal{R}$. The third of the revision types above deals with a possible
change in generic knowledge and raises a new perspective in the framework of
belief change (see Example 7.5.1).
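
The difference between simultaneous and successive revision, and the weaker coherence property that does hold, can be observed numerically for ME-inference. The following is a rough sketch under stated assumptions, not the book's algorithm: worlds are pairs (a, b), the prior is uniform, all probabilities lie strictly between 0 and 1, and the revision by a set of conditionals is approximated by cycling through closed-form single-conditional adjustments (cyclic projection). The names `project`, `revise`, and `distance` are hypothetical.

```python
from itertools import product

WORLDS = list(product([True, False], repeat=2))  # worlds are pairs (a, b)


def project(dist, cond):
    """Minimum cross-entropy adjustment of dist to ONE conditional (B|A)[x], 0 < x < 1."""
    ante, cons, x = cond
    p_ver = sum(p for w, p in dist.items() if ante(w) and cons(w))
    p_fal = sum(p for w, p in dist.items() if ante(w) and not cons(w))
    alpha = x * p_fal / ((1 - x) * p_ver)
    new = {}
    for w, p in dist.items():
        if not ante(w):
            new[w] = p                      # conditional not applicable
        elif cons(w):
            new[w] = p * alpha ** (1 - x)   # verifying world
        else:
            new[w] = p * alpha ** (-x)      # falsifying world
    total = sum(new.values())
    return {w: p / total for w, p in new.items()}


def revise(dist, conds, sweeps=2000):
    """Approximate dist * conds by repeatedly cycling through the conditionals."""
    for _ in range(sweeps):
        for cond in conds:
            dist = project(dist, cond)
    return dist


def distance(p, q):
    """Largest pointwise difference between two distributions."""
    return max(abs(p[w] - q[w]) for w in WORLDS)


prior = {w: 0.25 for w in WORLDS}
R = [(lambda w: w[0], lambda w: w[1], 0.8)]   # (b|a)[0.8]
S = [(lambda w: True, lambda w: w[1], 0.3)]   # b[0.3], a fact as degenerate conditional

simultaneous = revise(prior, R + S)            # Psi * (R u S)
successive = revise(revise(prior, R), S)       # (Psi * R) * S
coherent = revise(revise(prior, R), R + S)     # (Psi * R) * (R u S)
```

In this toy setting `successive` no longer satisfies (b|a)[0.8] and therefore differs clearly from `simultaneous`, whereas `coherent` coincides with `simultaneous` up to numerical tolerance, in line with (7.5) and logical coherence.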

Thus in a generalized framework, different types of belief change may
be realized by making use of one and the same (binary) revision operator,
or universal inference operation, respectively, in different ways, allowing a
convenient homogeneity of methods. For doing so, it is necessary, however, to
represent background (generic, prior) knowledge as separated from evidential
knowledge pertaining to the given situation (which is often regarded as an
essential prerequisite for efficient plausible reasoning, cf. [DP96, DP97b]).
In the following section, we will throw some formal light on belief revision
and updating when it is possible to distinguish between knowledge on
different levels.



7.3 Separating Background from Evidential Knowledge 



Epistemic states provide a convenient, stable and rich representation of
knowledge and may serve as an excellent starting point to perform a belief
change operation. Yet they have no history: all (uncertain or conditional)
knowledge is considered on the same level, and no distinction is made between
explicit and implicit knowledge, or between generic and evidential knowledge.
Generic knowledge may be regarded as constraints imposed on epistemic states,
so a proper handling is possible by considering sets of epistemic states (cf.
e.g. [Voo96a]). If, however, one is not willing to give up the convenience of
having a single epistemic state as “best” knowledge base, a way to overcome
the restrictions described above may be offered by taking (properly defined)
belief bases as primitive representations of epistemic knowledge, from which
epistemic states may be calculated.

Definition 7.3.1. A belief base is a pair $(\Psi, \mathcal{R})$, where $\Psi$
is an epistemic state (background knowledge), and
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^*$ is a set of (quantified)
conditionals representing contextual (or evidential) knowledge.

Usually, evidential knowledge is restricted to certain facts, representing
knowledge about a present case. This is generalized here by considering the
evidence $\mathcal{R}$ as reflecting knowledge about the context under
consideration, thus being again of an epistemic nature. Certain knowledge is
dealt with as a borderline case, but is not of major interest. So fluctuation
of knowledge is modelled more naturally: Prior knowledge serves as a base for
obtaining an adequate probabilistic description of the present context which
may be used again as background knowledge for further change operations.

The transition from belief bases to epistemic states can be achieved by
an adequate universal inference operation C, or a binary belief change
operator $*$, respectively:

$*(\Psi, \mathcal{R}) := \Psi * \mathcal{R}$ with $C_\Psi(\mathcal{R}) = Th^*(\Psi * \mathcal{R})$    (7.6)

For this to be well-defined, we have to ensure that both
$C_\Psi(\mathcal{R}) \neq \emptyset$ and
$C_\Psi(\mathcal{R}) \neq (\mathcal{L}|\mathcal{L})^*$. In this book, we will only deal
with consistent belief change, assuming C to preserve consistency and
$\mathcal{R}$ to be consistent with the prior knowledge if necessary. Though
struggling with inconsistencies is certainly a challenging subject, the
concentration on handling consistent beliefs in the present framework will
help to get a clear first view on the topic.

In the following, we will develop postulates for revising belief bases
$(\Psi, \mathcal{R})$ by new conditional information
$\mathcal{S} \subseteq (\mathcal{L}|\mathcal{L})^*$, yielding a new belief base
$(\Psi, \mathcal{R}) \circ \mathcal{S}$, in the sense of the AGM-postulates.

Due to distinguishing background knowledge from context information, 
we are able to compare the knowledge stored in different belief bases: 

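Definition 7.3.2. A pre-ordering $\sqsubseteq$ on belief bases is defined by

$(\Psi_1, \mathcal{R}_1) \sqsubseteq (\Psi_2, \mathcal{R}_2)$ iff $\Psi_1 = \Psi_2$ and $\mathcal{R}_2 \models^* \mathcal{R}_1$.

$(\Psi_1, \mathcal{R}_1)$ and $(\Psi_2, \mathcal{R}_2)$ are $\sqsubseteq$-equivalent,

$(\Psi_1, \mathcal{R}_1) \equiv_{\sqsubseteq} (\Psi_2, \mathcal{R}_2)$,

iff $(\Psi_1, \mathcal{R}_1) \sqsubseteq (\Psi_2, \mathcal{R}_2)$ and
$(\Psi_2, \mathcal{R}_2) \sqsubseteq (\Psi_1, \mathcal{R}_1)$.

Therefore $(\Psi_1, \mathcal{R}_1) \equiv_{\sqsubseteq} (\Psi_2, \mathcal{R}_2)$ iff
$\Psi_1 = \Psi_2$ and $\mathcal{R}_1$ and $\mathcal{R}_2$ are semantically
equivalent, i.e. iff both belief bases reflect the same epistemic (background
and contextual) knowledge. If the universal inference operation C, resp. $*$,
satisfies left logical equivalence (cf. p. 13) with respect to $Cn^*$, then
$(\Psi_1, \mathcal{R}_1) \equiv_{\sqsubseteq} (\Psi_2, \mathcal{R}_2)$ implies
$\Psi_1 * \mathcal{R}_1 = \Psi_2 * \mathcal{R}_2$.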

The following postulates do not make use of the universal inference
operation but are to characterize pure belief base revision by the revision
operator $\circ$:

Postulates for conditional base revision:

Let $\Psi$ be an epistemic state, and let
$\mathcal{R}, \mathcal{R}_1, \mathcal{S} \subseteq (\mathcal{L}|\mathcal{L})^*$ be sets of
conditionals.

(CBR1) $(\Psi, \mathcal{R}) \circ \mathcal{S}$ is a belief base.

(CBR2) If $(\Psi, \mathcal{R}) \circ \mathcal{S} = (\Psi, \mathcal{R}_1)$ then $\mathcal{R}_1 \models^* \mathcal{S}$.

(CBR3) $(\Psi, \mathcal{R}) \sqsubseteq (\Psi, \mathcal{R}) \circ \mathcal{S}$.

(CBR4) $(\Psi, \mathcal{R}) \circ \mathcal{S}$ is a minimal belief base (with
respect to $\sqsubseteq$) among all belief bases satisfying (CBR1)-(CBR3).

(CBR1) is the most fundamental axiom and coincides with the demand
for categorical matching (cf. [GR94]). (CBR2) is generally called success: the
new context information is now represented (up to epistemic equivalence).
(CBR3) states that revision should preserve prior knowledge. Thus it is
crucial for revision in contrast to update. Finally, (CBR4) is in the sense of
informational economy (cf. [Gar88]): No unnecessary changes should occur.
Admittedly, our postulates are much simpler than those proposed by Hansson
(see, for instance, [Han89, Han91], and [GR94, p. 61]). They are, however,
not based upon classical logic. So they are more adequate in the framework
of general epistemic states.

The following characterization may be proved easily: 

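Theorem 7.3.1. The revision operator $\circ$ satisfies the axioms
(CBR1)-(CBR4) iff

$(\Psi, \mathcal{R}) \circ \mathcal{S} \equiv_{\sqsubseteq} (\Psi, \mathcal{R} \cup \mathcal{S})$.    (7.7)

So from (CBR1)-(CBR4), other properties of the revision operator also
follow in a straightforward manner which are usually found among
characterizing postulates:

Proposition 7.3.1. Suppose the revision operator $\circ$ satisfies (7.7).
Then it fulfills the following properties:

(i) If $\mathcal{R} \models^* \mathcal{S}$, then $(\Psi, \mathcal{R}) \circ \mathcal{S} \equiv_{\sqsubseteq} (\Psi, \mathcal{R})$;

(ii) If $(\Psi_1, \mathcal{R}_1) \sqsubseteq (\Psi_2, \mathcal{R}_2)$ then $(\Psi_1, \mathcal{R}_1) \circ \mathcal{S} \sqsubseteq (\Psi_2, \mathcal{R}_2) \circ \mathcal{S}$;

(iii) $((\Psi, \mathcal{R}) \circ \mathcal{S}_1) \circ \mathcal{S}_2 \equiv_{\sqsubseteq} (\Psi, \mathcal{R}) \circ (\mathcal{S}_1 \cup \mathcal{S}_2)$,

where $(\Psi, \mathcal{R})$, $(\Psi_1, \mathcal{R}_1)$, $(\Psi_2, \mathcal{R}_2)$ are belief
bases and $\mathcal{S}, \mathcal{S}_1, \mathcal{S}_2 \subseteq (\mathcal{L}|\mathcal{L})^*$.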

(i) shows a minimality of change, while (ii) is stated in [Gar88] as a
monotonicity postulate. (iii) deals with the handling of non-conflicting
iterated revisions.

Here we investigate revision merely under the assumption that the new
information is compatible with what is already known. Belief revision based
on classical logics is nothing but expansion in this case, and Theorem 7.3.1
indeed shows that revision of belief bases should reasonably mean expanding
contextual knowledge. Note that revising a belief base $(\Psi, \mathcal{R})$ by
$\mathcal{S} \subseteq (\mathcal{L}|\mathcal{L})^*$ also induces a change of the
corresponding belief state $\Psi^* = \Psi * \mathcal{R}$ to
$(\Psi^*)' = *((\Psi, \mathcal{R}) \circ \mathcal{S})$. According to Theorem 7.3.1, if
the (underlying) universal inference operation C satisfies left logical
equivalence, then the only reasonable revision operation (as specified by
(CBR1)-(CBR4)) is given on the belief state level by

$*((\Psi, \mathcal{R}) \circ \mathcal{S}) = \Psi * (\mathcal{R} \cup \mathcal{S})$    (7.8)

and therefore,

$Th^*(*((\Psi, \mathcal{R}) \circ \mathcal{S})) = C_\Psi(\mathcal{R} \cup \mathcal{S})$.

This parallels the result for the classical belief revision theory, with the
inference operation $C_\Psi$ replacing the classical consequence operation
(cf. [Gar88]; see also Theorem 2.2.1, p. 14). Nevertheless, we prefer using
the more general term “revision” to “expansion” here. For if we consider the
epistemic states generated by the two belief bases, $\Psi * \mathcal{R}$ and
$*((\Psi, \mathcal{R}) \circ \mathcal{S}) = \Psi * (\mathcal{R} \cup \mathcal{S})$, we see
that the epistemic status that $\Psi * \mathcal{R}$ assigns to conditionals
occurring in $\mathcal{S}$ will normally differ from those in $\Psi$ as well as
from those in $\Psi * (\mathcal{R} \cup \mathcal{S})$ with expanded contextual
knowledge. So the belief in the conditionals in $\mathcal{S}$ is actually
revised.
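
Base revision in the sense of Theorem 7.3.1 can be pictured with a very small data structure. This is only an illustrative sketch: the class `BeliefBase` and its method `revise` are hypothetical names, the epistemic state is an opaque label, conditionals are plain strings, and syntactic set equality stands in for the semantic equivalence used in (7.7).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BeliefBase:
    """A belief base (Psi, R): background state plus contextual conditionals."""

    state: str            # opaque label for the epistemic state Psi
    context: frozenset    # the set R of conditionals, here plain strings

    def revise(self, new_info):
        """Base revision in the spirit of (7.7): keep Psi, expand R to R u S."""
        return BeliefBase(self.state, self.context | frozenset(new_info))


base = BeliefBase("P", frozenset({"(d|a)[0.4]"}))
step_by_step = base.revise({"y[1]"}).revise({"(a v d)[1]"})
at_once = base.revise({"y[1]", "(a v d)[1]"})
unchanged = base.revise(base.context)
```

Here `step_by_step` and `at_once` coincide, mirroring Proposition 7.3.1 (iii), and revising by information already contained in the context leaves the base unchanged, mirroring (i). The background state is never touched; only the contextual part grows.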



7.4 Revising Epistemic States by Sets of Conditionals 

Introducing belief bases in Section 7.3 opened up the possibility to perform
genuine revisions in a clear way, namely by extending evidential or contextual
knowledge learned about a static world. In general, however, a revision of an
epistemic state by (sets of) conditionals may also be triggered by information
referring to changes in the world, thus actually demanding an update of the
epistemic state. Typical situations for updating occur when knowledge about
a prior world is to be adapted to more recent information (e.g. a demographic
model gained from statistical data of past periods should be brushed up by
new data, see, for instance, the Florida murderers examples 3.5.1 and 4.5.1,
or Example 7.5.1 below).

In the sequel, we will list reasonable postulates for a (general) revision
of epistemic states by sets of conditionals, matching both the frameworks of
(genuine) revision and of updating. The postulates partly generalize those
for revising an epistemic state by a single conditional listed in Section 4.1,
page 55. Furthermore, we will compare these postulates to properties of the
underlying universal inference operation, thus continuing to study the close
connection between belief change on the one hand and nonmonotonic inference
operations on the other hand, which has been known already for a couple
of years (see, for instance, [MG91, GR94]) within the framework of classical
logics.

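Postulates for revising epistemic states by sets of conditionals:

(CSR1) $\Psi * \mathcal{R} \models \mathcal{R}$.

(CSR2) If $\Psi \models \mathcal{R}$ then $\Psi * \mathcal{R} = \Psi$.

(CSR3) If $\mathcal{R}_1$ and $\mathcal{R}_2$ are equivalent, then $\Psi * \mathcal{R}_1 = \Psi * \mathcal{R}_2$.

(CSR4) If $\Psi * \mathcal{R}_1 \models \mathcal{R}_2$ and $\Psi * \mathcal{R}_2 \models \mathcal{R}_1$ then $\Psi * \mathcal{R}_1 = \Psi * \mathcal{R}_2$.

(CSR5) $\Psi * (\mathcal{R}_1 \cup \mathcal{R}_2) = (\Psi * \mathcal{R}_1) * (\mathcal{R}_1 \cup \mathcal{R}_2)$.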

Postulates (CSR1), (CSR2) and (CSR3) constitute basic properties of
epistemic belief change. (CSR4) states that two revising procedures with
respect to sets $\mathcal{R}_1$ and $\mathcal{R}_2$ should result in the same
epistemic knowledge base if each revision represents the new information of
the other. This property is called reciprocity in the framework of
nonmonotonic logics (cf. [Mak94]) and appears as axiom (U6) in the work of
Katsuno and Mendelzon [KM91b]. (CSR5) is the postulate for logical coherence
and deals with iterative revision. It demands that at least, updating any
intermediate epistemic state $\Psi * \mathcal{R}_1$ by the full information
$\mathcal{R}_1 \cup \mathcal{R}_2$ should result in the same epistemic state as
revising $\Psi$ by $\mathcal{R}_1 \cup \mathcal{R}_2$ in one step. The rationale
behind this axiom is that if the information about the new world drops in in
parts, updating any provisional state of belief by the full information
should result unambiguously in a final belief state.

Note that in general, the revisions $(\Psi * \mathcal{R}_1) * \mathcal{R}_2$ and
$(\Psi * \mathcal{R}_1) * (\mathcal{R}_1 \cup \mathcal{R}_2)$ will differ because the
first is not supposed to maintain prior contextual information,
$\mathcal{R}_1$. As was already mentioned, (CSR5) is a set-theoretical version
of axiom (C1) in [DP97a] (see also page 22). (CSR5) has proved to be a crucial
property for the characterization of ME-inference (cf. Chapter 5) but actually
goes back to [SJ81].

Postulates for reasonable revisions or updatings, respectively, based on 
inference processes are also proposed in [PV92]. 

For representing revision operations satisfying the postulates stated above, 
we will make use of the relationship between binary revision operators and 
universal inference operations: 

Proposition 7.4.1. Suppose revision is being realized via a universal
inference operation as in (7.3).

(i) $*$ satisfies (CSR1) iff C is reflexive.

(ii) $*$ satisfies (CSR2) iff C is founded.

(iii) $*$ satisfies (CSR3) iff C satisfies left logical equivalence.

(iv) Assuming reflexivity resp. the validity of (CSR1), $*$ satisfies (CSR4)
iff C is cumulative.

(v) Assuming foundedness resp. the validity of (CSR2), $*$ satisfies (CSR5)
iff C is strongly cumulative.

The proofs are immediate. From this proposition, a representation result 
follows in a straightforward manner: 

Theorem 7.4.1. If * is defined by (7.3), it satisfies all of the postulates 
(CSR1)-(CSR5) iff the universal inference operation C is reflexive, founded, 
strongly cumulative and satisfies left logical equivalence. 



7.5 Revision versus Updating 

In this section, we will try to get a clearer view on formal parallels and
differences, respectively, between (genuine) revision and updating. For an
adequate comparison, we have to observe the changes of belief states that
are induced by revision of belief bases. Observing (7.6), (CBR2) and (CBR3)
translate into

(CBR2') $*((\Psi, \mathcal{R}) \circ \mathcal{S}) \models \mathcal{S}$.

(CBR3') $*((\Psi, \mathcal{R}) \circ \mathcal{S}) \models \mathcal{R}$.

While (CBR2') parallels (CSR1), (CBR3') establishes a crucial difference
between revision and updating: revision preserves prior knowledge while
updating does not, neither in a classical nor in a generalized framework.

The intended effects of revision and updating on a belief state
$\Psi * \mathcal{R}$ that is generated by a belief base $(\Psi, \mathcal{R})$ are
made obvious by - informally! - writing

$(\Psi * \mathcal{R}) \circ \mathcal{S} = \Psi * (\mathcal{R} \cup \mathcal{S}) \neq (\Psi * \mathcal{R}) * \mathcal{S}$    (7.9)

(cf. (7.8)). This reveals clearly the difference, but also the relationship
between revision and updating: Revising $\Psi * \mathcal{R}$ by $\mathcal{S}$
results in the same state of belief as updating $\Psi$ by (the full contextual
information) $\mathcal{R} \cup \mathcal{S}$. Note also the difference in case that
$\Psi * \mathcal{R} \models \mathcal{S}$: If $*$ satisfies (CSR2) then
$(\Psi * \mathcal{R}) * \mathcal{S} = \Psi * \mathcal{R}$, but, in general,
$\Psi * (\mathcal{R} \cup \mathcal{S})$ will differ from $\Psi * \mathcal{R}$. This does
not violate the cumulativity of $*$, or of $C_\Psi$, respectively, because
$\mathcal{S}$ is not supposed to include $\mathcal{R}$ (if it does, then (strong)
cumulativity yields equality). Rather it reveals clearly the distinction
between having $\mathcal{S}$ to be incorporated as explicit constraints and
having $\mathcal{S}$ only satisfied in a revised epistemic state (cf. [GP96,
page 88]).

The representation of an epistemic state by a belief base, however, is not
unique: different belief bases may generate the same belief state (the same
holds for classical belief bases, cf. [Han89], [GR94, p. 48]). So we could not
define genuine epistemic revision on belief states, but had to consider belief
bases in order to separate background and context knowledge unambiguously.
It is interesting to observe, however, that strong cumulativity, together with
foundedness, ensures at least a convenient independence of revision from
background knowledge: If two belief bases $(\Psi_1, \mathcal{R})$,
$(\Psi_2, \mathcal{R})$ with different prior knowledge but the same contextual
knowledge give rise to the same belief state

$\Psi_1 * \mathcal{R} = \Psi_2 * \mathcal{R}$,

then - assuming strong cumulativity and foundedness -

$\Psi_1 * (\mathcal{R} \cup \mathcal{S}) = (\Psi_1 * \mathcal{R}) * (\mathcal{R} \cup \mathcal{S})$

$= (\Psi_2 * \mathcal{R}) * (\mathcal{R} \cup \mathcal{S})$

$= \Psi_2 * (\mathcal{R} \cup \mathcal{S})$.

So strong cumulativity and foundedness guarantee a particular
well-behavedness with respect to inference, updating and revision.

In the following example, we will illustrate revision and updating in a pro- 
babilistic environment, using ME-inference as the proper universal inference 
operation. 

Example 7.5.1. A psychologist has been working with addicted people for a 
couple of years. His experiences concerning the propositions 

V : a : addicted to alcohol 
d : addicted to drugs 
y : being young 

may be summarized by the following distribution P that expresses his belief
state probabilistically (where negation is indicated by barring the
corresponding letter):

The following probabilistic conditionals may be entailed from P:

$(d|a)\,[0.242]$ (i.e. $P(d|a) = 0.242$)

$(d|\bar{a})\,[0.666]$ (i.e. $P(d|\bar{a}) = 0.666$)

$(a|y)\,[0.246]$    $(a|\bar{y})\,[0.660]$

$(d|y)\,[0.662]$    $(d|\bar{y})\,[0.251]$

Now the psychologist is going to change his job: He will be working in 
a clinic for people addicted only to alcohol and/or drugs. He is told that 
the percentage of persons addicted to alcohol, but also addicted to drugs, is 
higher than usual and may be estimated by 40 %. 

So the information the psychologist has about the “new world” is
represented by

$\mathcal{R} = \{a \vee d\,[1],\ (d|a)[0.4]\}$.

The distribution P from above is now supposed to represent background or
prior knowledge, respectively. So the psychologist revises or updates,
respectively, P by $\mathcal{R}$ using ME-inference and obtains
$P^* = P *_{ME} \mathcal{R}$ as new belief state:

After having spent a couple of days in the new clinic, the psychologist
realized that this clinic was for young people only. So he had to revise
genuinely his knowledge about his new sphere of activity and arrived at the
revised belief state
$*_{ME}((P, \mathcal{R}) \circ y[1]) = P *_{ME} (\mathcal{R} \cup y[1]) =: P_1^*$
shown in the following table:

(7.10)



7.6 Focusing in a Probabilistic Framework 

Focusing means applying generic knowledge to a reference class appropriate
to describe the context of interest (cf. [DP97b, DP96]). In a probabilistic
setting, focusing is best done by conditioning which, however, is used for
revision, too. So revision and focusing are supposed to coincide in the
framework of Bayesian probabilities though they differ conceptually: Revision
is not only applying knowledge, but means incorporating a new constraint so
as to refine knowledge.

Dubois and Prade argued that the assumption of having a uniquely deter- 
mined probability distribution to represent the available knowledge at best 
is responsible for that flaw, and they recommend to use upper and lower 
probabilities to permit a proper distinction (cf. [DP96, DP97b]). 

Making use of ME-inference $*_{ME}$, however, it is indeed possible to realize
this conceptual difference appropriately without giving up the convenience
of having a single distribution for inferences. To make this clear, we have
to consider belief changes induced by some certain information A[1], that
is, we learn proposition A with certainty. The following proposition reveals
the difference between revision by A[1], as realized according to (7.8), and
focusing to A by conditioning (cf. also Proposition 6.3.2 in Section 6.3).

Proposition 7.6.1. Let P be a distribution,
$\mathcal{R} \subseteq (\mathcal{L}|\mathcal{L})^{prob}$ a P-consistent set of
probabilistic conditionals, and suppose A[1] to be a certain probabilistic
fact.

(i) $P *_{ME} \{A[1]\} = P(\cdot|A)$;

in particular, $(P *_{ME} \mathcal{R}) *_{ME} \{A[1]\} = (P *_{ME} \mathcal{R})(\cdot|A)$.

(ii) $*_{ME}((P, \mathcal{R}) \circ \{A[1]\}) = P *_{ME} (\mathcal{R} \cup \{A[1]\}) = P(\cdot|A) *_{ME} \mathcal{R}$.

Both parts of this proposition may be proved by using the representation 
(5.5) of the ME-distribution. 
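
The two sides of Proposition 7.6.1 can be compared on a toy example. The sketch below is purely illustrative: the prior over worlds (a, d, y) is hypothetical (it is not the distribution of Example 7.5.1), the context consists of the single conditional (d|a)[0.4] so that the ME adjustment has a closed form, and the helper names `condition`, `me_adjust`, and `cond_prob` are assumptions. Revision by y[1] is computed as P(.|y) adjusted to the conditional, following part (ii); focusing is computed as the adjusted distribution conditioned on y, following part (i).

```python
def condition(dist, event):
    """Focusing: Bayesian conditioning P(. | event)."""
    total = sum(p for w, p in dist.items() if event(w))
    return {w: (p / total if event(w) else 0.0) for w, p in dist.items()}


def me_adjust(dist, ante, cons, x):
    """ME adjustment of dist to a single conditional (cons|ante)[x], 0 < x < 1."""
    p_ver = sum(p for w, p in dist.items() if ante(w) and cons(w))
    p_fal = sum(p for w, p in dist.items() if ante(w) and not cons(w))
    alpha = x * p_fal / ((1 - x) * p_ver)
    new = {}
    for w, p in dist.items():
        if not ante(w):
            new[w] = p
        elif cons(w):
            new[w] = p * alpha ** (1 - x)
        else:
            new[w] = p * alpha ** (-x)
    total = sum(new.values())
    return {w: p / total for w, p in new.items()}


def cond_prob(dist, cons, ante):
    """P(cons | ante) for predicates on worlds."""
    p_ante = sum(p for w, p in dist.items() if ante(w))
    return sum(p for w, p in dist.items() if ante(w) and cons(w)) / p_ante


# Hypothetical prior over worlds (a, d, y); the numbers are made up.
prior = {
    (True, True, True): 0.05, (True, True, False): 0.05,
    (True, False, True): 0.20, (True, False, False): 0.10,
    (False, True, True): 0.15, (False, True, False): 0.15,
    (False, False, True): 0.15, (False, False, False): 0.15,
}


def a(w):
    return w[0]


def d(w):
    return w[1]


def y(w):
    return w[2]


revised = me_adjust(condition(prior, y), a, d, 0.4)   # P(.|y) * R
focused = condition(me_adjust(prior, a, d, 0.4), y)   # (P * R)(.|y)
```

In this example the revised state still satisfies (d|a)[0.4], whereas focusing yields P(d|a) = 1/3: conditioning applies the adjusted knowledge to the subclass of young patients but does not keep the conditional as an explicit constraint.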

Example 7.6.1 (continued). The distribution obtained in (7.10) by revision
is different from the one the psychologist would have obtained by focusing
his knowledge represented by $P^* = P *_{ME} \mathcal{R}$ on a young patient,
which is given by $P^* *_{ME} \{y[1]\} = P^*(\cdot|y) =: P_2^*$:

Proposition 7.6.1 and Examples 7.5.1 and 7.6.1 show that, in a
(generalized) probabilistic framework, a proper distinction between focusing
and revision is possible. This difference is akin to the one between
“conditioning” and “constraining” elaborated by Voorbraak [Voo96a] for classes
of probability functions (for a criticism of conditioning sets of probability
measures, cf. [GH98]). Paris and Vencovska [PV92] also consider focusing by
using uncertain information in the context of various inference processes.





8. Knowledge Discovery by Following 
Conditional Structures 



In many cases, knowledge bases for expert systems consist of rules, i.e., of
conditional statements. In the previous chapters, we investigated in detail the
formal properties of conditionals, how to represent them appropriately and
how to handle them under change of beliefs. Solving these problems is a
necessary prerequisite to arrive at a satisfactory representation and processing
of knowledge. When designing an expert system, however, one first has to
face another crucial problem: Where do all the rules come from? How does one
find a set of rules representing relevant knowledge in an exhaustive way?
Besides human expertise, experimental data may also be available. Incorporating
the detailed experiences of an expert into the knowledge base is usually an
indispensable task in knowledge acquisition. Extracting and providing
information from databases, however, may essentially help to support, automate
and improve this process.

Data mining and knowledge discovery, respectively, mean finding new
and relevant information in databases. Usually, knowledge discovery is
understood as the more comprehensive task, including preparing and
cleaning the available data and interpreting the results revealed by the actual
data mining process, which aims at discovering interesting patterns in data
(cf. [FPSS96, FPSSU96, FU+96]).

In this chapter, we will focus on this central part of knowledge discovery
within a probabilistic framework, where we assume experimental data to
be represented by a probability distribution. This means that we will deal
with relatively “small” data mining problems with respect to the number of
variables or propositions involved. By using clustering techniques (see, for
instance, [AGGR98]) and considering LEG-networks as an appropriate tool
to split up large probability distributions into a system of local distributions
(see Chapter 9), however, the problem of discovering relevant relationships
amongst variables can be reduced to mining manageable distributions.

Relationships amongst variables and sets of variables may be expressed by
association rules (cf. [AIS93, MS98, AMS+96, SA95, Bol96], and see below)
which are a special kind of probabilistic conditionals. Relevance of such rules
is usually measured by considering their confidence, which is nothing but a
conditional probability, and their support, which is the number of cases in the
database that a rule is based upon (cf. [AIS93]). These are certainly plausible
indicators for a rule to interest the user. If, for instance, the manager of a
supermarket wants to improve the layout of his store, he will be interested
in knowing how many customers buying product A also buy product B, and
which percentage of total sales those transactions constitute.

When designing the knowledge base of an expert system, however,
relevance of rules depends on the representation and inference methods used.
Within a probabilistic framework, knowledge of conditional independences is
of particular importance. Moreover, when using ME-methods, we would best
find a set of conditionals that represents the distribution under consideration
by means of ME-propagation. The corresponding rules will be considered not
only technically relevant, but also relevant in a fundamental,
information-theoretical sense. By observing that ME-inference obeys the principle
of conditional preservation (cf. Section 5.1 and Section 4.5), we have to search
for a set of conditionals with respect to which a given distribution is
indifferent (cf. Definition 4.5.1). An approach to solve this problem is
developed in Section 8.2 and constitutes the main contribution of this chapter.
We will illustrate this method in several examples. In particular, we will show
how it may help to find a suitable ME-optimal set of rules.

We will start by recalling some results from general probabilistic
knowledge discovery.



8.1 Probabilistic Knowledge Discovery 

Mining statistical databases roughly serves three purposes: Firstly, one is
interested in finding relevant association rules, i.e. expressions of the form
$A \to B$ where $A, B$ are disjoint subsets of an item set $\mathcal{I}$ (cf.
[AIS93]). The database yields a relative frequency distribution $Pr$ over
$\mathcal{I}$, and the support and the confidence, respectively, of such a rule
$A \to B$ are simply defined as $Pr(A \cup B)$ and $Pr(B \mid A)$,
respectively. Effective algorithms are available to find significant
association rules even in large databases (see [AIS93, MS98,
AMS+96, SA95, Bol96]).
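
As a small illustration of these two measures, the following sketch computes
support and confidence as relative frequencies over a list of transactions.
The basket data and the function names are hypothetical and serve only to make
the definitions concrete; this is not one of the algorithms cited above.

```python
def support(transactions, items):
    """Relative frequency of transactions containing every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Pr(B | A): support of A and B together divided by the support of A."""
    joint = support(transactions, frozenset(antecedent) | frozenset(consequent))
    return joint / support(transactions, antecedent)

# Hypothetical market-basket data, for illustration only.
baskets = [frozenset(t) for t in (
    {"bread", "butter"}, {"bread", "butter", "milk"},
    {"bread"}, {"milk"}, {"bread", "milk"},
)]

rule_support = support(baskets, {"bread", "butter"})          # 2 of 5 baskets
rule_confidence = confidence(baskets, {"bread"}, {"butter"})  # 2 of 4 bread baskets
```

Note that `confidence` raises a `ZeroDivisionError` when the antecedent never
occurs; a practical tool would filter such rules by a minimum support first.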

As a second task, statistical databases and probability distributions,
respectively, are investigated in search of causal structures which can be
represented as Markov graphs, or as directed acyclic graphs (see [SGS93]). The
notion of conditional independence is fundamental to those techniques, and
an important application is the discovery of Bayesian networks from data
(see, for instance, [SGS93, Jen96, Hec96]).

Thirdly, if the number of variables is not too large, one may consider
the resulting distribution as representing important inherent relationships
between propositional variables in a more logical sense and search for
interesting conditional rules revealing these relationships. This approach may
be considered as dual to the second one: Here dependencies, not
independencies, are to be discovered. This is often done for building up the
knowledge base of an expert system. Here a special type of conditional, the
so-called single-elementary conditional, is of particular interest, since it is
supposed to pinpoint relevant information:

Definition 8.1.1. A conditional $(b \mid A)$ is called a single-elementary
conditional if the antecedent $A$ is an elementary conjunction, i.e. a
conjunction of literals, and the conclusion $b$ is a single literal, and if
neither $Ab \equiv \bot$ nor $A\bar{b} \equiv \bot$. The basic conditional
$\psi_{\omega,\omega'} = (\mathit{form}(\omega) \mid \mathit{form}(\omega, \omega'))$
(cf. Definition 3.4.2) is a basic single-elementary conditional if $\omega$ and
$\omega'$ are neighboring worlds, i.e. interpretations differing with respect
to exactly one atom.

Traditionally, single-elementary conditionals are appreciated for
representing knowledge very clearly and intelligibly. Thus they occupy an
outstanding position for knowledge representation and reasoning. Association
rules are similar to single-elementary conditionals. There are, however,
structural differences: Antecedent and conclusion of an association rule may be
considered as elementary conjunctions with only positive literals. The
conclusion of a single-elementary conditional consists of only one literal,
i.e., it contains exactly one item. Basic single-elementary conditionals are
single-elementary conditionals with antecedents of maximal length.

Usually, single-elementary conditionals are considered important if the
corresponding probability is significantly high, that is, within a
certain distance $\varepsilon \geq 0$ of 1. But high probability alone does not
suffice to make a conditional really relevant: If $P(B \mid A) > 1 - \varepsilon$
then there will normally be a lot of other variables $V$ such that
$P(B \mid Av) > 1 - \varepsilon$, too. So the problem of discovering relevant
single-elementary conditionals may be conceived as finding
single-elementary conditionals where the antecedent is as short as possible
(with respect to the number of occurring literals). This problem is dealt with
by the author and others in [KIR96, Ger97], and [Sch98]; moreover, [Sch98]
also searches for exceptions to such shortest single-elementary conditionals.
In both cases, parts of the computed conditionals were used for probabilistic
knowledge representation via ME-inference (cf. [RKI97b]). These sets of
conditionals, however, were not optimally suitable because no specific feature
of ME-methods was taken into consideration in the discovery algorithms.
In particular, ME-propagation yields a distribution which is indifferent with
respect to the set of learned conditionals (see Section 5.1). So, as a
necessary condition for an appropriate set $\mathcal{R}$ of conditionals
ME-representing a given distribution $P$, we may require that $P$ be
indifferent with respect to $\mathcal{R}$ (see Section 3.6.1). An approach to
solve this problem is developed in the next section within a more general
setting, considering conditional valuation functions instead of probability
distributions.



8.2 Discovering Conditional Structures

Given some conditional valuation function $V: \mathcal{L} \to (\mathcal{A},
\oplus, \odot, 0_{\mathcal{A}}, 1_{\mathcal{A}})$, two questions arise at once:

- What knowledge does $V$ represent? What are the propositional and the
conditional beliefs held in $V$?

- Which subset of these conditionals (including facts) is distinguished in the
sense that $V$ is in accordance with the conditional structures it imposes on
possible worlds?

The first question means answering queries $(B \mid A)[x]$, $x = ?$, by
calculating $V(B \mid A)$. The second question amounts to finding a set
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{\mathcal{A}}$ such
that $V$ is a c-representation with respect to $\mathcal{R}$. Ideally, we would
have $V$ to be a faithful c-representation, i.e. we are searching for a set
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})^{\mathcal{A}}$
such that $V \models \mathcal{R}$ and $\ker V = \ker \sigma_{\mathcal{R}}$, or
$\ker_0 V = \ker_0 \sigma_{\mathcal{R}}$, respectively.
Assuming faithfulness means presupposing that no equation
$V(\hat{\omega}) = 1_{\mathcal{A}}$ is fulfilled accidentally, but that any of
these equations is induced by $\mathcal{R}$ (cf. also
the Faithfulness condition in [SGS93, pp. 35f.]).

At the end of Section 3.5, we made some first considerations concerning
this generally complicated and expensive task.

In this section, as the main result of this chapter, we will present an
approach to computing sets of conditionals that underlie the knowledge
represented by some conditional valuation function $V: \mathcal{L} \to
\mathcal{A}$. As a crucial prerequisite, we will assume that this knowledge is
representable by a set of single-elementary conditionals. This restriction to
single-elementary conditionals is important, but should not be considered a
heavy drawback, bearing in mind the expressive power of these conditionals.

Suppose $\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ is an existing,
but unknown set of single-elementary conditionals, such that
$\ker \sigma_{\mathcal{R}} = \ker V$, and $\ker V$ is known. In the following,
we will present a method for determining or approximating, respectively, sets
$\mathcal{S} \subseteq (\mathcal{L} \mid \mathcal{L})$ such that
$\ker V = \ker \sigma_{\mathcal{S}}$.

Each conditional in $\mathcal{R}$ is presupposed to be single-elementary, so
we set

$$\mathcal{R} = \{(b_1 \mid A_1), \ldots, (b_n \mid A_n)\} \tag{8.1}$$

where $A_i$ are elementary conjunctions and $b_i$ are literals,
$1 \leq i \leq n$. Without loss of generality, only to simplify notation, we
assume all literals $b_i$ to be positive (cf. Lemma 3.5.2).

Let

$$\sigma_{\mathcal{R}}: \hat{\Omega} \to \mathcal{F}_{\mathcal{R}} = \langle \mathbf{a}_1^+, \mathbf{a}_1^-, \ldots, \mathbf{a}_n^+, \mathbf{a}_n^- \rangle$$

denote a conditional structure homomorphism with respect to $\mathcal{R}$ (cf.
Equation (3.19) in Section 3.5, page 42).

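For each atom $v$ of $\mathcal{L}$, choose an arbitrary, but fixed numbering of
the remaining atoms $\{w \mid w \neq v\} = \{w_0, w_1, \ldots,
w_{\#(atoms)-2}\}$. Then basic single-elementary conditionals are conditionals
of the form

$$\psi_{v,l} = \Big(v \;\Big|\; \bigwedge_j w_j^{\epsilon_j}\Big) \tag{8.2}$$

with $\epsilon_j \in \{0, 1\}$, $w_j^1 := w_j$, $w_j^0 := \bar{w}_j$,
$0 \leq j \leq \#(atoms) - 2$ and $l = \sum_j \epsilon_j 2^j$.

We will abbreviate the antecedent of $\psi_{v,l}$ by $C_{v,l}$:

$$C_{v,l} := \bigwedge_j w_j^{\epsilon_j}, \qquad l = \sum_j \epsilon_j 2^j \tag{8.3}$$

(the numbering $w_j$ depends on $v$). Let

$$\mathcal{B} = \{\psi_{v,l} \mid v \text{ atom in } \mathcal{L},\ 0 \leq l \leq 2^{\#(atoms)-1} - 1\}$$

denote the set of all basic single-elementary conditionals in
$(\mathcal{L} \mid \mathcal{L})$, and let

$$\mathcal{F}_{\mathcal{B}} = \langle \mathbf{b}_{v,l}^+, \mathbf{b}_{v,l}^- \mid v \text{ atom in } \mathcal{L},\ 0 \leq l \leq 2^{\#(atoms)-1} - 1 \rangle$$

be the free abelian group corresponding to $\mathcal{B}$ (cf. Section 3.5) with
conditional structure homomorphism

$$\sigma_{\mathcal{B}} = \prod_{v,l} \sigma_{v,l}: \hat{\Omega} \to \mathcal{F}_{\mathcal{B}}, \tag{8.4}$$

$$\sigma_{v,l}(\omega) = \begin{cases} \mathbf{b}_{v,l}^+ & \text{if } \omega = C_{v,l}\, v \\ \mathbf{b}_{v,l}^- & \text{if } \omega = C_{v,l}\, \bar{v} \\ 1 & \text{else} \end{cases} \tag{8.5}$$

Lemma 8.2.1. $\sigma_{\mathcal{B}}$ is injective, i.e.
$\ker \sigma_{\mathcal{B}} = \{1\}$.

So $\sigma_{\mathcal{B}}$ provides the most finely grained conditional
structure on $\hat{\Omega}$: No different elements
$\hat{\omega}_1 \neq \hat{\omega}_2$ are equivalent with respect to
$\mathcal{B}$.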
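
To make the indexing of (8.2) to (8.5) concrete, here is a minimal sketch for
three atoms. It rests on assumptions of ours: the atoms are called a, b, c, the
remaining atoms are numbered in reverse alphabetical order, and an element of
the free abelian group is stored as a multiset of generators `(v, l, sign)`.
The final check only confirms that distinct single worlds get distinct images;
it does not establish injectivity on the whole group.

```python
from collections import Counter
from itertools import product

ATOMS = ("a", "b", "c")

def index_l(world, v):
    """Index l of the antecedent C_{v,l} that `world` satisfies (cf. (8.3))."""
    rest = [w for w in reversed(ATOMS) if w != v]   # assumed numbering w_0, w_1
    return sum(1 << j for j, w in enumerate(rest) if world[w])

def sigma_B(world):
    """sigma_B(world) as a multiset of generators (v, l, sign), cf. (8.5)."""
    return Counter({(v, index_l(world, v), "+" if world[v] else "-"): 1
                    for v in ATOMS})

# All eight worlds over a, b, c, each as a dict {atom: truth value}.
worlds = [dict(zip(ATOMS, bits)) for bits in product((True, False), repeat=3)]
images = [frozenset(sigma_B(w)) for w in worlds]
```

Every world is mapped to exactly one generator per atom, so the twelve basic
single-elementary conditionals of a three-atom language give each world a
signature of three generators.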

We define a homomorphism

$$g: \mathcal{F}_{\mathcal{B}} \to \mathcal{F}_{\mathcal{R}}$$

via

$$g(\mathbf{b}_{v,l}^+) = \prod_{\substack{1 \leq i \leq n \\ \psi_{v,l} \sqsubseteq (b_i \mid A_i)}} \mathbf{a}_i^+, \qquad g(\mathbf{b}_{v,l}^-) = \prod_{\substack{1 \leq i \leq n \\ \psi_{v,l} \sqsubseteq (b_i \mid A_i)}} \mathbf{a}_i^- \tag{8.6}$$

where $\sqsubseteq$ is defined according to Definition 3.4.1 in Section 3.4,
p. 36. Note that we presupposed all conditionals in $\mathcal{R}$ to have
positive literals in the consequent. Otherwise, the definition of $g$ has to be
altered appropriately, but the results to be presented in the sequel will still
hold. The prerequisite of dealing with single-elementary conditionals, however,
is essential for the following. The next lemma provides an easy, but
far-reaching characterization for the relation $\sqsubseteq$ to hold between
single-elementary conditionals:



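Lemma 8.2.2. Let $(b \mid A)$ and $(d \mid C)$ be two single-elementary
conditionals. Then

$$(d \mid C) \sqsubseteq (b \mid A) \quad \text{iff} \quad C \leq A \text{ and } b = d \tag{8.7}$$

Remark 8.2.1. The preceding lemma may be slightly generalized to hold for
conditionals $(b \mid A)$ and $(d \mid C)$ where $A$ and $C$ are disjunctions of
elementary conjunctions not containing $b$ resp. $d$.

Using Lemma 8.2.2, we have

$$g(\mathbf{b}_{v,l}^{\pm}) = \prod_{\substack{1 \leq i \leq n \\ \psi_{v,l} \sqsubseteq (b_i \mid A_i)}} \mathbf{a}_i^{\pm} = \prod_{\substack{1 \leq i \leq n \\ v = b_i,\ C_{v,l} \leq A_i}} \mathbf{a}_i^{\pm} \tag{8.8}$$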

It is important to note that for different atoms $v$ and $v'$, only different
$\mathbf{a}_i^+$ occur in $g(\mathbf{b}_{v,l}^+)$ and
$g(\mathbf{b}_{v',l'}^+)$, respectively, by Lemma 8.2.2 (analogously
for $\mathbf{a}_i^-$ and $g(\mathbf{b}_{v,l}^-)$ and
$g(\mathbf{b}_{v',l'}^-)$, respectively). Moreover, each $\mathbf{a}_i^+$ and
$\mathbf{a}_i^-$, respectively, occurs at most once in each
$g(\mathbf{b}_{v,l}^+)$ and $g(\mathbf{b}_{v,l}^-)$, respectively.
This will be used several times in the sequel.

$g$ establishes a connection between the conditional structures with respect
to $\mathcal{B}$ and to the still unknown, but existing, set $\mathcal{R}$:

Theorem 8.2.1. Let $g: \mathcal{F}_{\mathcal{B}} \to \mathcal{F}_{\mathcal{R}}$
be defined as in (8.6). Then $\sigma_{\mathcal{R}} = g \circ \sigma_{\mathcal{B}}$.

The property of all conditionals in $\mathcal{R}$ to be single-elementary is
crucial for the proof of this theorem. Only in this case is it guaranteed that
each $\mathbf{a}_i^+$ or $\mathbf{a}_i^-$, respectively, occurring in
$\sigma_{\mathcal{R}}(\omega)$, also occurs exactly once in
$g \circ \sigma_{\mathcal{B}}(\omega)$.
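
Theorem 8.2.1 can be checked by brute force on a small instance. The sketch
below is illustrative only: the rule set `RULES` is a hypothetical choice of
ours, group elements are stored as multisets of generators `(i, sign)`, and
the relation between conditionals is decided by the characterization of
Lemma 8.2.2 (equal conclusions, and the basic antecedent implies A_i).

```python
from collections import Counter
from itertools import product

ATOMS = ("a", "b", "c")
# Hypothetical rule set R: (conclusion atom b_i, antecedent A_i as {atom: value}).
RULES = [("b", {"a": True}), ("c", {"b": True})]

def satisfies(world, conj):
    """True iff `world` satisfies the elementary conjunction `conj`."""
    return all(world[x] == val for x, val in conj.items())

def sigma_R(world):
    """a_i^+ if rule i is verified, a_i^- if falsified, 1 if inapplicable."""
    out = Counter()
    for i, (b, A) in enumerate(RULES):
        if satisfies(world, A):
            out[(i, "+" if world[b] else "-")] += 1
    return out

def g_after_sigma_B(world):
    """g(sigma_B(world)), with g evaluated via Lemma 8.2.2."""
    out = Counter()
    for v in ATOMS:
        # Antecedent C_{v,l} of the basic conditional for v that `world` meets.
        C = {w: world[w] for w in ATOMS if w != v}
        sign = "+" if world[v] else "-"
        for i, (b, A) in enumerate(RULES):
            # psi_{v,l} lies below (b_i|A_i) iff v = b_i and C_{v,l} implies A_i.
            if b == v and all(C.get(x) == val for x, val in A.items()):
                out[(i, sign)] += 1
    return out

# Compare both sides of Theorem 8.2.1 on every world.
for bits in product((True, False), repeat=len(ATOMS)):
    world = dict(zip(ATOMS, bits))
    assert sigma_R(world) == g_after_sigma_B(world)
```

Since both maps are homomorphisms, agreement on the generating worlds carries
over to all products of worlds.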

Theorem 8.2.1 immediately provides a method for determining $\ker g$ by
considering $\sigma_{\mathcal{B}}$ and $\ker \sigma_{\mathcal{R}}$, which is
assumed to be known:

Corollary 8.2.1. $\hat{\omega} \in \ker \sigma_{\mathcal{R}}$ iff
$\sigma_{\mathcal{B}}(\hat{\omega}) \in \ker g$.

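Proposition 8.2.1. Let $\hat{\omega} = \omega_1^{r_1} \cdot \ldots \cdot
\omega_m^{r_m} \in \hat{\Omega}$.

Then $\sigma_{\mathcal{B}}(\hat{\omega}) \in \ker g$ iff for all atoms $v$ in
$\mathcal{L}$,

$$\prod_{\substack{1 \leq k \leq m \\ \omega_k = C_{v,l_k} v}} (\mathbf{b}_{v,l_k}^+)^{r_k},\ \prod_{\substack{1 \leq k \leq m \\ \omega_k = C_{v,l_k} \bar{v}}} (\mathbf{b}_{v,l_k}^-)^{r_k} \in \ker g. \tag{8.9}$$

So each (generating) element of $\ker \sigma_{\mathcal{R}}$ gives rise to an
equation modulo $\ker g$ for the generators of $\mathcal{F}_{\mathcal{B}}$.

Lemma 8.2.3. Let $v$ be an atom of the language $\mathcal{L}$.

$\prod_k (\mathbf{b}_{v,l_k}^+)^{r_k} \in \ker g$ or
$\prod_k (\mathbf{b}_{v,l_k}^-)^{r_k} \in \ker g$ iff for all
$(b_i \mid A_i) \in \mathcal{R}$, $1 \leq i \leq n$, such that $v = b_i$ it
holds that $\sum_{k:\, C_{v,l_k} \leq A_i} r_k = 0$.

This lemma shows a complete symmetry between the generators
$\mathbf{b}_{v,l}^+$ and $\mathbf{b}_{v,l}^-$ occurring in elements of
$\ker g$ (which is also obvious by the definition of $g$). So the superscripts
may be omitted if not explicitly needed. Formally, let $\mathbf{b}_{v,l}$
denote the quotient of $\mathbf{b}_{v,l}^+$ and $\mathbf{b}_{v,l}^-$:

$$\mathbf{b}_{v,l} = \frac{\mathbf{b}_{v,l}^+}{\mathbf{b}_{v,l}^-} \tag{8.10}$$

Then the following corollary holds:

Corollary 8.2.2. Let $v$ be an atom of the language $\mathcal{L}$.

$$\prod_k (\mathbf{b}_{v,l_k}^+)^{r_k} \in \ker g \quad \text{iff} \quad \prod_k (\mathbf{b}_{v,l_k}^-)^{r_k} \in \ker g \quad \text{iff} \quad \prod_k (\mathbf{b}_{v,l_k})^{r_k} \in \ker g$$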

The idea of the procedure to be described in the sequel is to explore the
relations mod $\ker g$ holding between the group elements
$\mathbf{b}_{v,l} \in \mathcal{F}_{\mathcal{B}}$ with the
aim to define a finite sequence of sets $\mathcal{S}^{(0)}, \mathcal{S}^{(1)},
\ldots$ of conditionals such that

$$\ker \sigma_{\mathcal{S}^{(0)}} \subseteq \ker \sigma_{\mathcal{S}^{(1)}} \subseteq \ldots \subseteq \ker \sigma_{\mathcal{R}} \tag{8.11}$$

Thus the sequence $\mathcal{S}^{(0)}, \mathcal{S}^{(1)}, \ldots$ tries to
approximate $\mathcal{R}$. We will first present the fundamental idea of the
method and develop the necessary theoretical results in the rest of this
section. In the next section, the procedure will be explained and illustrated
in detail by examples.

Step 0:

We start with setting

$$\mathcal{S}^{(0)} = \mathcal{B} \tag{8.12}$$



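Lemma 8.2.1 states $\ker \sigma_{\mathcal{S}^{(0)}} = \{1\}$, so (8.11)
trivially holds. Let $\equiv_g$ denote the equivalence relation mod $\ker g$ on
$\mathcal{F}_{\mathcal{B}}$, i.e. $\mathbf{b}_1 \equiv_g \mathbf{b}_2$ iff
$g(\mathbf{b}_1) = g(\mathbf{b}_2)$ for any two elements
$\mathbf{b}_1, \mathbf{b}_2 \in \mathcal{F}_{\mathcal{B}}$, where $g$ is
defined as in (8.6). For each (generating) element
$\hat{\omega} = \omega_1^{r_1} \cdot \ldots \cdot \omega_m^{r_m}$ of
$\ker \sigma_{\mathcal{R}}$, set up an equation modulo $\ker g$:

$$\sigma_{\mathcal{B}}(\hat{\omega}) \equiv_g 1,$$

this means, according to (3.21),

$$1 \equiv_g \prod_{v} \Big( \prod_{\substack{1 \leq k \leq m \\ \omega_k = C_{v,l_k} v}} (\mathbf{b}_{v,l_k}^+)^{r_k} \cdot \prod_{\substack{1 \leq k \leq m \\ \omega_k = C_{v,l_k} \bar{v}}} (\mathbf{b}_{v,l_k}^-)^{r_k} \Big) \tag{8.13}$$

and split up these equations according to Proposition 8.2.1 and Corollary
8.2.2.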



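Step 1:

First, eliminate from $\mathcal{S}^{(0)} = \mathcal{B}$ all $\psi_{v,l}$ for
which there is an equation $\mathbf{b}_{v,l} \equiv_g 1$:

$$\mathcal{S}^{(0.5)} = \mathcal{S}^{(0)} - \{\psi_{v,l} \in \mathcal{S}^{(0)} \mid \mathbf{b}_{v,l} \equiv_g 1 \text{ is known}\}$$

The equations modulo $\ker g$ further partition $\mathcal{S}^{(0.5)}$ into
equivalence classes
$[\mathbf{b}_{v,l}]_g = \{\mathbf{b}_{v,l'} \mid \psi_{v,l'} \in
\mathcal{S}^{(0.5)},\ \mathbf{b}_{v,l} \equiv_g \mathbf{b}_{v,l'} \text{ is
known}\}$, $\psi_{v,l} \in \mathcal{S}^{(0.5)}$ (only $\mathbf{b}_{v,l'}$
with the same $v$ occur in $[\mathbf{b}_{v,l}]_g$, according to Lemma 8.2.2).
For each such equivalence class $[\mathbf{b}_{v,l}]_g$, set

$$D_{v,j}^{(1)} = \bigvee_{\mathbf{b}_{v,l'} \in [\mathbf{b}_{v,l}]_g} C_{v,l'}$$

and

$$\phi_{v,j}^{(1)} = (v \mid D_{v,j}^{(1)}) = \bigsqcup_{\mathbf{b}_{v,l'} \in [\mathbf{b}_{v,l}]_g} \psi_{v,l'},$$

where $j = 1, 2, \ldots$ is a proper finite numbering. Now we set

$$\mathcal{S}^{(1)} = \{\phi_{v,j}^{(1)} \mid v \text{ atom},\ j = 1, 2, \ldots\} \tag{8.14}$$

Define homomorphisms

$$h^{(1)}: \mathcal{F}_{\mathcal{S}^{(0)}} \to \mathcal{F}_{\mathcal{S}^{(1)}} \quad \text{and} \quad g^{(1)}: \mathcal{F}_{\mathcal{S}^{(1)}} \to \mathcal{F}_{\mathcal{R}}$$

via

$$h^{(1)}(\mathbf{b}_{v,l}) = \begin{cases} 1 & \text{if } \mathbf{b}_{v,l} \equiv_g 1 \\ \mathbf{s}_{v,j}^{(1)} & \text{if } \psi_{v,l} \sqsubseteq \phi_{v,j}^{(1)} \end{cases} \tag{8.15}$$

$$g^{(1)}(\mathbf{s}_{v,j}^{(1)}) = \prod_{\substack{1 \leq i \leq n \\ v = b_i,\ D_{v,j}^{(1)} \leq A_i}} \mathbf{a}_i \tag{8.16}$$

Note that we omit the superscripts $+$ and $-$ for simplicity of notation. This
is in accordance with (8.10), because $\mathbf{b}_{v,l}^+$ and
$\mathbf{b}_{v,l}^-$ are dealt with in a completely symmetrical way, and (8.15)
and (8.16) also hold for the corresponding quotients. $h^{(1)}$ is
well-defined, because each $\mathbf{b}_{v,l} \not\equiv_g 1$ is contained in
exactly one equivalence class $[\mathbf{b}_{v,l'}]_g$ and thus
$\psi_{v,l} \sqsubseteq \phi_{v,j}^{(1)}$ for exactly one $j$.

So $h^{(1)}$ models the transition from $\mathcal{S}^{(0)}$ to
$\mathcal{S}^{(1)}$, and $g^{(1)}$ relates $\mathcal{F}_{\mathcal{S}^{(1)}}$ to
$\mathcal{F}_{\mathcal{R}}$ as $g$ does for $\mathcal{F}_{\mathcal{B}}$. The
following lemma shows that an equation similar to the one given in Theorem
8.2.1 still holds: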



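Lemma 8.2.4. Let $g^{(1)}, h^{(1)}$ be defined as above. Then the following
relationships hold:

(i) $g = g^{(1)} \circ h^{(1)}$.

(ii) $\sigma_{\mathcal{S}^{(1)}} = h^{(1)} \circ \sigma_{\mathcal{S}^{(0)}}$.

(iii) $g^{(1)} \circ \sigma_{\mathcal{S}^{(1)}} = \sigma_{\mathcal{R}}$.

Corollary 8.2.3. $\ker \sigma_{\mathcal{S}^{(0)}} \subseteq \ker
\sigma_{\mathcal{S}^{(1)}} \subseteq \ker \sigma_{\mathcal{R}}$.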

This first step usually reduces $\mathcal{B}$ considerably and shows the
general pattern of modifying the set of conditionals under consideration by
defining appropriate homomorphisms $h^{(1)}$ and $g^{(1)}$, respectively. This
will be pursued in the next step, too. As an important difference to Step 1,
however, we will no longer deal with basic single-elementary conditionals. More
generally, $\mathcal{S}^{(1)}$ is a set of conditionals $\phi_{v,j}^{(1)}$ with
a single atom $v$ in the conclusion, and the antecedent $D_{v,j}^{(1)}$ of
$\phi_{v,j}^{(1)}$ is a disjunction of elementary conjunctions not
containing $v$.

Due to Lemma 8.2.4(i), we have

$$g\Big(\prod_{1 \leq k \leq m} (\mathbf{b}_{v,l_k})^{r_k}\Big) = 1 \quad \text{iff} \quad g^{(1)}\Big(\prod_{1 \leq k \leq m} h^{(1)}(\mathbf{b}_{v,l_k})^{r_k}\Big) = 1$$

Thus by replacing each $\mathbf{b}_{v,l_k}$ by its image
$h^{(1)}(\mathbf{b}_{v,l_k})$, we will now explore equivalence mod
$\ker g^{(1)}$ between the generators of $\mathcal{F}_{\mathcal{S}^{(1)}}$.
Note that while neither $\mathcal{R}$ nor $g$ are known, the homomorphisms
$h^{(t)}$ will approximate $\mathcal{R}$ in a constructive way.

Step 2:

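Step 1 already revealed the basic idea of the method to be presented:
(Single-elementary) conditionals are joined by $\sqcup$ in accordance with the
equations modulo $\ker g$. The second step also follows this idea for more
complex equations. First we specify its starting point:

Prerequisites: Suppose $\mathcal{S}^{(t)}$ is a set of conditionals
$\phi_{v,j}^{(t)}$ with a single atom $v$ in the conclusion, and the antecedent
$D_{v,j}^{(t)}$ of $\phi_{v,j}^{(t)}$ is a disjunction of elementary
conjunctions not containing $v$. Let $\mathcal{F}_{\mathcal{S}^{(t)}} = \langle
\mathbf{s}_{v,j}^{(t)+}, \mathbf{s}_{v,j}^{(t)-} \rangle_{v,j}$ be the free
abelian group associated with $\mathcal{S}^{(t)}$, and let
$g^{(t)}: \mathcal{F}_{\mathcal{S}^{(t)}} \to \mathcal{F}_{\mathcal{R}}$ be the
homomorphism defined by

$$g^{(t)}(\mathbf{s}_{v,j}^{(t)\pm}) = \prod_{\substack{1 \leq i \leq n \\ v = b_i,\ D_{v,j}^{(t)} \leq A_i}} \mathbf{a}_i^{\pm}$$

such that

$$g^{(t)} \circ \sigma_{\mathcal{S}^{(t)}} = \sigma_{\mathcal{R}}.$$

Let $\equiv_{g^{(t)}}$ mean $\equiv \bmod \ker g^{(t)}$.

In this step, we exploit equations of the form

$$\mathbf{s}_{v,j_0}^{(t)} \equiv_{g^{(t)}} \mathbf{s}_{v,j_1}^{(t)} \cdot \ldots \cdot \mathbf{s}_{v,j_m}^{(t)}$$

to modify $\mathcal{S}^{(t)}$ appropriately. To obtain this modified set
$\mathcal{S}^{(t+1)}$:

1. eliminate $\phi_{v,j_0}^{(t)}$ from $\mathcal{S}^{(t)}$;

2. replace each $\phi_{v,j_k}^{(t)}$ by
$\phi_{v,j_k}^{(t+1)} = \phi_{v,j_0}^{(t)} \sqcup \phi_{v,j_k}^{(t)} =
(v \mid D_{v,j_0}^{(t)} \vee D_{v,j_k}^{(t)})$ for $1 \leq k \leq m$. Set
$D_{v,j_k}^{(t+1)} := D_{v,j_0}^{(t)} \vee D_{v,j_k}^{(t)}$,
$1 \leq k \leq m$;

3. retain all other $\phi_{w,l}^{(t)}$, i.e.

$$\phi_{w,l}^{(t+1)} := \phi_{w,l}^{(t)}, \qquad D_{w,l}^{(t+1)} := D_{w,l}^{(t)} \tag{8.17}$$

This also includes the case $m = 0$, i.e. $\mathbf{s}_{v,j_0}^{(t)}
\equiv_{g^{(t)}} 1$; in this case, (2) is vacuous and therefore is left out.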
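
The three operations of Step 2 can be sketched on a simple data structure.
Everything about the encoding is an assumption of ours: a conditional is
addressed by `(atom, index)`, its antecedent is a frozenset of elementary
conjunctions, and each conjunction is a frozenset of literal strings such as
`"a"` or `"-a"`. The sample data mirror the situation of an equation of the
form s_{b,3} = s_{b,1} s_{b,2}.

```python
def step2(conditionals, v, j0, js):
    """One Step-2 reduction for an equation s_{v,j0} = s_{v,j1} * ... * s_{v,jm}.

    `conditionals` maps (atom, index) to the antecedent D, given as a frozenset
    of the elementary conjunctions whose disjunction D is.
    """
    result = dict(conditionals)
    d0 = result.pop((v, j0))                    # 1. eliminate phi_{v,j0}
    for jk in js:                               # 2. join it onto each phi_{v,jk}
        result[(v, jk)] = result[(v, jk)] | d0
    return result                               # 3. all others are retained

conj = lambda *literals: frozenset(literals)    # helper for an elementary conjunction

# Hypothetical starting set: (b | -a c), (b | a -c), (b | a c).
before = {
    ("b", 1): frozenset({conj("-a", "c")}),
    ("b", 2): frozenset({conj("a", "-c")}),
    ("b", 3): frozenset({conj("a", "c")}),
}
after = step2(before, "b", 3, [1, 2])
# after[("b", 1)] now has antecedent (-a c) or (a c), which is equivalent to c;
# after[("b", 2)] has antecedent (a -c) or (a c), which is equivalent to a.
```

Calling `step2` with an empty list `js` covers the case m = 0, where the
conditional is only eliminated. The sketch does not simplify the resulting
disjunctions.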

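Define homomorphisms $h^{(t+1)}: \mathcal{F}_{\mathcal{S}^{(t)}} \to
\mathcal{F}_{\mathcal{S}^{(t+1)}}$ and
$g^{(t+1)}: \mathcal{F}_{\mathcal{S}^{(t+1)}} \to \mathcal{F}_{\mathcal{R}}$
by

$$h^{(t+1)}(\mathbf{s}_{w,l}^{(t)}) = \begin{cases} \prod_{1 \leq k \leq m} \mathbf{s}_{v,j_k}^{(t+1)} & \text{if } w = v,\ l = j_0 \\ \mathbf{s}_{v,j_k}^{(t+1)} & \text{if } w = v,\ l = j_k,\ 1 \leq k \leq m \\ \mathbf{s}_{w,l}^{(t+1)} & \text{else} \end{cases}$$

and

$$g^{(t+1)}(\mathbf{s}_{v,j}^{(t+1)}) = \prod_{\substack{1 \leq i \leq n \\ v = b_i,\ D_{v,j}^{(t+1)} \leq A_i}} \mathbf{a}_i$$

We now prove a statement equivalent to Lemma 8.2.4 for this general
case:

Lemma 8.2.5. Let $g^{(t+1)}, h^{(t+1)}$ be defined as above. Then the
following relationships hold:

(i) $g^{(t)} = g^{(t+1)} \circ h^{(t+1)}$.

(ii) $h^{(t+1)} \circ \sigma_{\mathcal{S}^{(t)}} = \sigma_{\mathcal{S}^{(t+1)}}$.

(iii) $g^{(t+1)} \circ \sigma_{\mathcal{S}^{(t+1)}} = \sigma_{\mathcal{R}}$.

So the new set $\mathcal{S}^{(t+1)}$ is apt to continue the set chain in the
sense of (8.11):

Corollary 8.2.4. With the same notation as in Lemma 8.2.5, it holds that

$$\ker \sigma_{\mathcal{S}^{(t)}} \subseteq \ker \sigma_{\mathcal{S}^{(t+1)}} \subseteq \ker \sigma_{\mathcal{R}} \tag{8.18}$$

By replacing each group element $\mathbf{s}_{v,j}^{(t)}$ by
$h^{(t+1)}(\mathbf{s}_{v,j}^{(t)})$, equations holding modulo $\ker g^{(t)}$
are transformed into equations modulo $\ker g^{(t+1)}$:

$$\prod_k (\mathbf{s}_{v,j_k}^{(t)})^{r_k} \equiv_{g^{(t)}} 1 \quad \text{iff} \quad \prod_k h^{(t+1)}(\mathbf{s}_{v,j_k}^{(t)})^{r_k} \equiv_{g^{(t+1)}} 1$$

due to Lemma 8.2.5(i).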

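By repeating step 2, the original equations modulo $\ker g$ are modified and
solved, if possible, defining a sequence of sets $\mathcal{S}^{(t)}$ of
conditionals such that

$$\ker \sigma_{\mathcal{S}^{(0)}} \subseteq \ldots \subseteq \ker \sigma_{\mathcal{S}^{(t)}} \subseteq \ker \sigma_{\mathcal{S}^{(t+1)}} \subseteq \ldots \subseteq \ker \sigma_{\mathcal{R}},$$

as desired, together with homomorphisms $g^{(t)}$ describing their
relationship to $\mathcal{R}$.

Suppose that no further reduction of equations modulo $\ker g^{(*)}$ according
to step 2 is possible, and the procedure halts. So we arrive at a set
$\mathcal{S}^{(*)}$ of conditionals $\phi_{v,j}^{(*)}$ with a single atom $v$
in the conclusion, and the antecedent $D_{v,j}^{(*)}$ of $\phi_{v,j}^{(*)}$ is
a disjunction of elementary conjunctions not containing $v$. Let
$\mathcal{F}_{\mathcal{S}^{(*)}} = \langle \mathbf{s}_{v,j}^{(*)+},
\mathbf{s}_{v,j}^{(*)-} \rangle_{v,j}$ be the free abelian group associated
with $\mathcal{S}^{(*)}$, and let
$g^{(*)}: \mathcal{F}_{\mathcal{S}^{(*)}} \to \mathcal{F}_{\mathcal{R}}$ be the
homomorphism defined by

$$g^{(*)}(\mathbf{s}_{v,j}^{(*)\pm}) = \prod_{\substack{1 \leq i \leq n \\ v = b_i,\ D_{v,j}^{(*)} \leq A_i}} \mathbf{a}_i^{\pm}$$

such that

$$g^{(*)} \circ \sigma_{\mathcal{S}^{(*)}} = \sigma_{\mathcal{R}}.$$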



Now, one of the following two situations may occur:

- Either, there are still non-trivial equations modulo $\ker g^{(*)}$. In this
case, $\ker \sigma_{\mathcal{S}^{(*)}}$ is only an approximation of
$\ker \sigma_{\mathcal{R}}$, in which it is included.

- Or, all equations modulo $\ker g$ could be solved successfully, so no
non-trivial equations modulo $\ker g^{(*)}$ are left. That is, for any
$\hat{\omega} \in \ker \sigma_{\mathcal{R}}$,
$1 = \sigma_{\mathcal{R}}(\hat{\omega}) = g^{(*)} \circ
\sigma_{\mathcal{S}^{(*)}}(\hat{\omega})$ holds trivially, i.e. due to
$\sigma_{\mathcal{S}^{(*)}}(\hat{\omega}) = 1$. But this means
$\ker \sigma_{\mathcal{R}} \subseteq \ker \sigma_{\mathcal{S}^{(*)}} \subseteq
\ker \sigma_{\mathcal{R}}$, so

$$\ker \sigma_{\mathcal{S}^{(*)}} = \ker \sigma_{\mathcal{R}}$$

and the procedure ends up with full success.

In general, the techniques described in steps 1 and 2 will not suffice to
eliminate all equations modulo $\ker g$, and we will be left with more complex
equations modulo $\ker g^{(*)}$ of the form

$$\prod_k (\mathbf{s}_{v,j_k}^{(*)})^{r_k} \equiv_{g^{(*)}} \prod_l (\mathbf{s}_{v,j_l}^{(*)})^{s_l}, \tag{8.19}$$

all $r_k, s_l > 0$. The great variety of relationships possibly holding between
the conditionals involved makes it difficult, if not impossible in general, to
construct a new appropriate set of conditionals in a straightforward way.

Nevertheless, the method developed so far already illustrates the central 
idea of how to find the conditionals whose structures a conditional valuation 
function V follows: By investigating relations between the numerical values 
of V, the effects of conditionals are analyzed and isolated, and conditionals 
are joined suitably so as to fit the conditional structures inherent to V. The 
operations on conditionals are based on equations between group elements 
representing these conditionals. So the formal framework for conditionals 
developed in Chapter 3 once again proved useful, providing the possibility of 
calculating relevant conditionals from e.g. probability distributions. We will 
illustrate how this works by considering examples in the next section. 

Though at present the method is not guaranteed to terminate
successfully, we will find that in many cases, it will yield a useful
approximation of the unknown set $\mathcal{R}$ of conditionals. Treating
equations of form (8.19) is a topic of our ongoing research, and results will
be published in a further paper.

8.3 Examples - ME-Knowledge Discovery

We will now illustrate the method described in the previous section by two
probabilistic examples. Given a probability distribution $P$, we will show how
to calculate a set $\mathcal{S}$ (or $\mathcal{S}^{prob}$, respectively) of
(probabilistic) conditionals such that $P = P_0 *_{ME} \mathcal{S}^{prob}$,
where $*_{ME}$ is the ME-operator and $P_0$ is a suitable uniform distribution.
That is, we are going to solve what is known as the inverse maxent problem.

Due to the fact that $P$ is necessarily indifferent with respect to such a set
$\mathcal{S}$, analyzing the relationships between the numerical values in $P$
will help to find such an $\mathcal{S}$, as is explained in the previous
section and as will be carried out in the following. Note that by assuming $P$
to be a faithful representation of some set $\mathcal{R}$, i.e.
$P(\hat{\omega}) = 1$ iff $\sigma_{\mathcal{R}}(\hat{\omega}) = 1$, we have
$P(\hat{\omega}) = 1$ iff $\sigma_{\mathcal{B}}(\hat{\omega}) \equiv_g 1$,
according to Corollary 8.2.1.

We consider formulas involving the three atomic propositions $a, b, c$,
interpreted by

a being a student
b being young
c being single (i.e. unmarried)

in two different settings, represented by two distributions. We list the twelve
basic single-elementary conditionals of $\mathcal{B}$:



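$\psi_{a,0} = (a \mid \bar{b}\bar{c})$    $\psi_{b,0} = (b \mid \bar{a}\bar{c})$    $\psi_{c,0} = (c \mid \bar{a}\bar{b})$
$\psi_{a,1} = (a \mid \bar{b}c)$    $\psi_{b,1} = (b \mid \bar{a}c)$    $\psi_{c,1} = (c \mid \bar{a}b)$
$\psi_{a,2} = (a \mid b\bar{c})$    $\psi_{b,2} = (b \mid a\bar{c})$    $\psi_{c,2} = (c \mid a\bar{b})$
$\psi_{a,3} = (a \mid bc)$    $\psi_{b,3} = (b \mid ac)$    $\psi_{c,3} = (c \mid ab)$

with corresponding generators $\mathbf{b}_{v,l}^{\pm}$ of
$\mathcal{F}_{\mathcal{B}}$.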

Example 8.3.1. The first distribution $P_1$ over $a, b, c$ is given as follows:

    ω                          P_1(ω)     ω                                P_1(ω)
    $abc$                      0.3028     $\bar{a}bc$                      0.2133
    $ab\bar{c}$                0.0336     $\bar{a}b\bar{c}$                0.0237
    $a\bar{b}c$                0.0421     $\bar{a}\bar{b}c$                0.1712
    $a\bar{b}\bar{c}$          0.0421     $\bar{a}\bar{b}\bar{c}$          0.1712

By calculating ratios of probabilities of neighboring worlds, we observe
immediately

$$P_1(\bar{a}\bar{b}c) = P_1(\bar{a}\bar{b}\bar{c}), \quad P_1(a\bar{b}c) = P_1(a\bar{b}\bar{c}), \quad \frac{P_1(abc)}{P_1(ab\bar{c})} = \frac{P_1(\bar{a}bc)}{P_1(\bar{a}b\bar{c})},$$

and we will now show how these numerical relationships can be used to
calculate a set $\mathcal{S}$ of conditionals that may impose such a structure
on $P_1$.



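So let

$$K_1 := \left\langle \frac{\bar{a}\bar{b}c}{\bar{a}\bar{b}\bar{c}},\ \frac{a\bar{b}c}{a\bar{b}\bar{c}},\ \frac{abc \cdot \bar{a}b\bar{c}}{ab\bar{c} \cdot \bar{a}bc} \right\rangle$$

Step 0: We start with $\mathcal{S}^{(0)} = \mathcal{B}$; the generators of
$K_1$ give rise to the following equations modulo $\ker g$:

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{\bar{a}\bar{b}c}{\bar{a}\bar{b}\bar{c}}\right) = \frac{\mathbf{b}_{a,1}^- \mathbf{b}_{b,1}^- \mathbf{b}_{c,0}^+}{\mathbf{b}_{a,0}^- \mathbf{b}_{b,0}^- \mathbf{b}_{c,0}^-}$$

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{a\bar{b}c}{a\bar{b}\bar{c}}\right) = \frac{\mathbf{b}_{a,1}^+ \mathbf{b}_{b,3}^- \mathbf{b}_{c,2}^+}{\mathbf{b}_{a,0}^+ \mathbf{b}_{b,2}^- \mathbf{b}_{c,2}^-}$$

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{abc \cdot \bar{a}b\bar{c}}{ab\bar{c} \cdot \bar{a}bc}\right) = \frac{\mathbf{b}_{a,3}^+ \mathbf{b}_{b,3}^+ \mathbf{b}_{c,3}^+ \cdot \mathbf{b}_{a,2}^- \mathbf{b}_{b,0}^+ \mathbf{b}_{c,1}^-}{\mathbf{b}_{a,2}^+ \mathbf{b}_{b,2}^+ \mathbf{b}_{c,3}^- \cdot \mathbf{b}_{a,3}^- \mathbf{b}_{b,1}^+ \mathbf{b}_{c,1}^+}$$

Considering these equations for each atom $a, b, c$ separately and omitting the
$\{+, -\}$-signs, we obtain

$$\mathbf{b}_{a,1} \equiv_g \mathbf{b}_{a,0}, \qquad \mathbf{b}_{a,3} \equiv_g \mathbf{b}_{a,2}$$
$$\mathbf{b}_{b,1} \equiv_g \mathbf{b}_{b,0}, \qquad \mathbf{b}_{b,3} \equiv_g \mathbf{b}_{b,2}$$
$$\mathbf{b}_{c,0} \equiv_g \mathbf{b}_{c,2} \equiv_g 1, \qquad \mathbf{b}_{c,3} \equiv_g \mathbf{b}_{c,1}$$

(cf. Proposition 8.2.1 and Corollary 8.2.2).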
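
The splitting of Step 0 is mechanical and can be sketched as follows. The
encoding is an assumption of ours: a world is the frozenset of its true atoms,
a kernel element maps worlds to integer exponents, the three generators are
our reading of the relationships observed for P_1, and the remaining atoms
are numbered in reverse alphabetical order. Only the two simplest equation
shapes are classified; anything else is collected in `other`.

```python
from collections import defaultdict

ATOMS = ("a", "b", "c")

def index_l(world, v):
    """Index l of the basic antecedent C_{v,l} satisfied by `world`."""
    rest = [w for w in reversed(ATOMS) if w != v]    # assumed numbering w_0, w_1
    return sum(1 << j for j, w in enumerate(rest) if w in world)

def split_equation(kernel_element):
    """Split sigma_B(kernel_element) by atom and sign (cf. Proposition 8.2.1).

    Returns {(v, sign): {l: exponent}} with zero exponents dropped.
    """
    parts = defaultdict(lambda: defaultdict(int))
    for world, r in kernel_element.items():
        for v in ATOMS:
            parts[(v, "+" if v in world else "-")][index_l(world, v)] += r
    return {k: {l: e for l, e in d.items() if e} for k, d in parts.items()}

W = frozenset        # W("ac") is the world in which a and c hold and b does not
K1 = [
    {W("c"): 1, W(""): -1},
    {W("ac"): 1, W("a"): -1},
    {W("abc"): 1, W("b"): 1, W("ab"): -1, W("bc"): -1},
]

trivial, equal, other = set(), set(), []
for element in K1:
    for (v, _sign), exps in split_equation(element).items():
        if sorted(exps.values()) == [-1, 1]:
            equal.add((v,) + tuple(sorted(exps)))    # b_{v,l} and b_{v,l'} equivalent
        elif len(exps) == 1:
            trivial.add((v, next(iter(exps))))       # b_{v,l} equivalent to 1
        elif exps:
            other.append((v, exps))                  # not of the two simple shapes
```

For the generators above, the only entry of `other` is the four-term equation
for atom b from the third generator, which is already implied by the two
simple equations for b.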



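Step 1: We eliminate the basic single-elementary conditionals $\psi_{c,0}$ and
$\psi_{c,2}$ from $\mathcal{S}^{(0)} = \mathcal{B}$, and join conditionals
according to the equations above; we obtain as conditionals in
$\mathcal{S}^{(1)}$

$\psi_{a,0} \sqcup \psi_{a,1} =: \phi_{a,1}^{(1)} = (a \mid \bar{b})$
$\psi_{a,2} \sqcup \psi_{a,3} =: \phi_{a,2}^{(1)} = (a \mid b)$
$\psi_{b,0} \sqcup \psi_{b,1} =: \phi_{b,1}^{(1)} = (b \mid \bar{a})$
$\psi_{b,2} \sqcup \psi_{b,3} =: \phi_{b,2}^{(1)} = (b \mid a)$
$\psi_{c,1} \sqcup \psi_{c,3} =: \phi_{c,1}^{(1)} = (c \mid b)$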



These are all conditionals in $\mathcal{S}^{(1)}$ with corresponding elements
$\mathbf{s}_{v,j}^{(1)} \in \mathcal{F}_{\mathcal{S}^{(1)}}$.
All of the equations modulo $\ker g$ set up in step 0 are transformed into
trivial equations modulo $\ker g^{(1)}$. Calculating the probabilities of these
conditionals in $P_1$, we obtain

$P_1(a \mid \bar{b}) \approx 0.2$;  $P_1(a \mid b) \approx 0.6$;

$P_1(b \mid \bar{a}) \approx 0.4$;  $P_1(b \mid a) \approx 0.8$;

$P_1(c \mid b) \approx 0.9$.

By using an ME-tool (like SPIRIT, cf. [RKI97b]), we see that actually

$$P_1 = P_0 *_{ME} \{(a \mid \bar{b})[0.2], (a \mid b)[0.6], (b \mid \bar{a})[0.4], (b \mid a)[0.8], (c \mid b)[0.9]\},$$

and these conditionals represent knowledge incorporated by $P_1$ which is
relevant, in particular, with respect to ME-inference.
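
The conditional probabilities quoted above can be recomputed directly from the
table. In the following sketch, the assignment of the eight table values to
worlds `(a, b, c)` is our reading of the table (positive literals first in
each column); the helper names are hypothetical. It does not run an ME-tool,
it only evaluates P_1.

```python
# Worlds are triples (a, b, c) with 1 = true, 0 = false.
P1 = {
    (1, 1, 1): 0.3028, (1, 1, 0): 0.0336, (1, 0, 1): 0.0421, (1, 0, 0): 0.0421,
    (0, 1, 1): 0.2133, (0, 1, 0): 0.0237, (0, 0, 1): 0.1712, (0, 0, 0): 0.1712,
}

def prob(p, event):
    """Probability of the set of worlds satisfying `event`."""
    return sum(q for w, q in p.items() if event(w))

def cond_prob(p, conclusion, antecedent):
    """P(conclusion | antecedent) for predicates on worlds."""
    joint = prob(p, lambda w: antecedent(w) and conclusion(w))
    return joint / prob(p, antecedent)

a = lambda w: w[0] == 1
b = lambda w: w[1] == 1
c = lambda w: w[2] == 1
neg = lambda f: (lambda w: not f(w))

discovered = {
    "(a|-b)": cond_prob(P1, a, neg(b)),
    "(a|b)": cond_prob(P1, a, b),
    "(b|-a)": cond_prob(P1, b, neg(a)),
    "(b|a)": cond_prob(P1, b, a),
    "(c|b)": cond_prob(P1, c, b),
}
```

Rounded to one decimal, the five values agree with the probabilities attached
to the conditionals in the example; the small deviations stem from the
four-digit precision of the table.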



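Example 8.3.2. The second distribution $P_2$ over $a, b, c$ is given as follows:

    ω                          P_2(ω)     ω                                P_2(ω)
    $abc$                      0.1950     $\bar{a}bc$                      0.1528
    $ab\bar{c}$                0.1758     $\bar{a}b\bar{c}$                0.1378
    $a\bar{b}c$                0.0408     $\bar{a}\bar{b}c$                0.1081
    $a\bar{b}\bar{c}$          0.0519     $\bar{a}\bar{b}\bar{c}$          0.1378

Here important relationships between probabilities are revealed by

$$P_2(\bar{a}b\bar{c}) = P_2(\bar{a}\bar{b}\bar{c}), \quad \frac{P_2(abc)}{P_2(ab\bar{c})} = \frac{P_2(\bar{a}bc)}{P_2(\bar{a}b\bar{c})}, \quad \frac{P_2(a\bar{b}c)}{P_2(a\bar{b}\bar{c})} = \frac{P_2(\bar{a}\bar{b}c)}{P_2(\bar{a}\bar{b}\bar{c})},$$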

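so that

$$K_2 := \left\langle \frac{\bar{a}b\bar{c}}{\bar{a}\bar{b}\bar{c}},\ \frac{abc \cdot \bar{a}b\bar{c}}{ab\bar{c} \cdot \bar{a}bc},\ \frac{a\bar{b}c \cdot \bar{a}\bar{b}\bar{c}}{a\bar{b}\bar{c} \cdot \bar{a}\bar{b}c} \right\rangle$$

Again we start with $\mathcal{S}^{(0)} = \mathcal{B}$. The generators of $K_2$
yield the following equations modulo $\ker g$:

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{\bar{a}b\bar{c}}{\bar{a}\bar{b}\bar{c}}\right) = \frac{\mathbf{b}_{a,2}^- \mathbf{b}_{b,0}^+ \mathbf{b}_{c,1}^-}{\mathbf{b}_{a,0}^- \mathbf{b}_{b,0}^- \mathbf{b}_{c,0}^-}$$

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{abc \cdot \bar{a}b\bar{c}}{ab\bar{c} \cdot \bar{a}bc}\right) = \frac{\mathbf{b}_{a,3}^+ \mathbf{b}_{b,3}^+ \mathbf{b}_{c,3}^+ \cdot \mathbf{b}_{a,2}^- \mathbf{b}_{b,0}^+ \mathbf{b}_{c,1}^-}{\mathbf{b}_{a,2}^+ \mathbf{b}_{b,2}^+ \mathbf{b}_{c,3}^- \cdot \mathbf{b}_{a,3}^- \mathbf{b}_{b,1}^+ \mathbf{b}_{c,1}^+}$$

$$1 \equiv_g \sigma_{\mathcal{B}}\left(\frac{a\bar{b}c \cdot \bar{a}\bar{b}\bar{c}}{a\bar{b}\bar{c} \cdot \bar{a}\bar{b}c}\right) = \frac{\mathbf{b}_{a,1}^+ \mathbf{b}_{b,3}^- \mathbf{b}_{c,2}^+ \cdot \mathbf{b}_{a,0}^- \mathbf{b}_{b,0}^- \mathbf{b}_{c,0}^-}{\mathbf{b}_{a,0}^+ \mathbf{b}_{b,2}^- \mathbf{b}_{c,2}^- \cdot \mathbf{b}_{a,1}^- \mathbf{b}_{b,1}^- \mathbf{b}_{c,0}^+}$$

Considering these equations for each atom $a, b, c$ separately and omitting the
$\{+, -\}$-signs, we obtain

$$\mathbf{b}_{a,0} \equiv_g \mathbf{b}_{a,1} \equiv_g \mathbf{b}_{a,2} \equiv_g \mathbf{b}_{a,3}$$
$$\mathbf{b}_{c,0} \equiv_g \mathbf{b}_{c,1} \equiv_g \mathbf{b}_{c,2} \equiv_g \mathbf{b}_{c,3}$$
$$\mathbf{b}_{b,0} \equiv_g 1$$
$$\mathbf{b}_{b,3} \equiv_g \mathbf{b}_{b,1} \mathbf{b}_{b,2}$$

(cf. Proposition 8.2.1 and Corollary 8.2.2).

Eliminating and joining conditionals according to steps 1 and 2, we obtain

$$\mathcal{S}_2 = \{(a \mid \top), (c \mid \top), (b \mid a), (b \mid c)\}$$

and

$$\mathcal{S}_2^{prob} = \{(a \mid \top)[0.4635], (c \mid \top)[0.4967], (b \mid a)[0.8], (b \mid c)[0.7]\}$$

represents $P_2$ via ME-inference, i.e. $P_2 = P_0 *_{ME} \mathcal{S}_2^{prob}$.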



8.4 Open Problems and Concluding Remarks 



The approach to ME-optimal knowledge discovery developed in this chapter
makes an important new contribution to the field; nevertheless, a lot of work
remains to be done:

1. How are the crucial relationships inherent to a given $P$ found? It seems
advisable to investigate the orders of magnitude of the probabilities
instead of precise rational numbers. In many cases, considering ratios
$P(\omega)/P(\omega')$ of neighboring worlds $\omega, \omega'$ will help to
find important relationships.

2. The sets of conditionals discovered in Examples 8.3.1 and 8.3.2 are
not really ME-optimal because they contain redundant conditionals: It
is straightforward to check that
$P_1 = P_0 *_{ME} \{(b \mid a)[0.8], (c \mid b)[0.9]\}$ and
$P_2 = P_0 *_{ME} \{(b \mid a)[0.8], (b \mid c)[0.7]\}$. Eliminating redundant
conditionals from the resulting set of conditionals is still an open problem.

3. Last but not least, more complex equations of the type (8.19) still have to
be dealt with.

We are, however, optimistic that the method presented here may be
extended to also treat the more difficult equations in problem 3, and that it
can be modified appropriately to yield an even more effective algorithm,
tailor-made to ME-propagation and avoiding redundant conditionals, thus solving
problem 2.

The applicability of the method presented in this chapter depends neither
on the presupposition of $V$ being a faithful c-representation nor on having
a complete description of $\ker V$ available: Each numerical relationship found
amongst the values of $V$ corresponds to an element of $\ker V$ and may be used
to set up equations for the group elements in $\mathcal{F}_{\mathcal{B}}$
modulo $\ker g$. The generators of $\ker V$ are particularly appropriate for
this task, in that they yield basic equations, but any other element will do,
too. If $V$ fails to be a faithful c-representation of some suitable set of
conditionals, then too many equations modulo $\ker g$ will have to be solved
trivially. In this case, backtracking will be necessary, undoing the last
joining of conditionals.

Moreover, the techniques of Section 8.2 can also be applied when a prior
epistemic state has to be taken into account. For instance, in a
probabilistic framework, when $P$ is the actual distribution to be investigated
and $P'$ is some prior distribution, we can compute a set
$\mathcal{R} \subseteq (\mathcal{L} \mid \mathcal{L})$ with
$P' *_{ME} \mathcal{R} = P$ in the same way as above, namely by applying the
algorithm to the normalized function $P/P'$.




9. Algorithms and Implementations 



In this chapter, we present a selection of various computational approaches 
to quantified uncertain reasoning and probabilistic knowledge discovery. 



9.1 ME-Reasoning with Probabilistic Conditionals 



Probability theory provides a powerful, mathematically well-founded,
non-heuristic framework for uncertain reasoning, but, due to their exponential
complexity, probability distributions are not easily dealt with. Efficient
algorithms are necessary to represent probabilistic dependencies between a
large number of variables (or atoms), and to incorporate new information
so as to achieve a revised or instantiated probabilistic state of belief.

To reduce the complexity of probability distributions, ME-reasoning can
make use of so-called LEG-networks, where LEG stands for local event
group (cf. [Lem82, Lem83]). LEG-networks are hypergraphs whose
hyperedges (LEGs) consist of sets of atomic propositions (or events, or
propositional variables). To each LEG, a component marginal
distribution is associated. Like the clique trees of Bayesian networks (cf.
[LS88, Nea90, Jen96]), they allow local computations and propagations of
probabilities.

The expert system shell SPIRIT¹ uses LEG-networks for representing
sets of probabilistic rules and for reasoning via the principle of maximum
entropy (cf. [RM96, RKI97a, RKI97b, Mey98]). Given a set of probabilistic
conditionals $\mathcal{R}$, a hypergraph is constructed whose hyperedges each
consist exactly of the variables occurring in one conditional in $\mathcal{R}$.
Usually, this hypergraph fails to be acyclic, so a covering hypertree is
generated which allows a decomposition of the associated ME-distribution
$P^* = P_0 *_{ME} \mathcal{R}$ (cf. [Lem83, Mal89]). Learning of the
conditionals in $\mathcal{R}$ is done locally on the LEGs by iteratively
approximating the Lagrange factors $\alpha_i$ (cf. (5.5), p. 76), which yield
a potential representation of $P^*$. The global propagation is
then carried out by successively adjusting neighboring LEGs by a procedure
similar to iterative proportional scaling (cf. [Lau82, MR92]), passing through
the structure of the hypertree. Answering queries is done by modifying the
hypertree appropriately and propagating the values of the instantiated
variables (or atoms). It is worth noting that the Lagrange factors
$\alpha_i$, which are so meaningful for the theoretical results presented in
this book, are also of crucial importance for an efficient computation. For a
detailed description of the algorithm, cf. [Mey98].

¹ Available at http://www.fernuni-hagen.de/BWLOR/forsch.htm
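
To make the idea of iteratively adjusting one factor per conditional more
concrete, the following Python sketch fits an ME-distribution on a tiny,
fully enumerated set of worlds. It is not the SPIRIT algorithm: it uses no
LEG-network and no hypertree, but simply cycles through the conditionals and
rescales the verifying and falsifying worlds of each one (a cyclic
projection scheme in the spirit of iterative proportional scaling). The
names `me_fit` and `cond_prob`, the three atoms, the uniform prior and the
restriction to probabilities strictly between 0 and 1 are assumptions made
for illustration.

```python
from itertools import product

ATOMS = ("a", "b", "c")
WORLDS = [dict(zip(ATOMS, bits))
          for bits in product((True, False), repeat=len(ATOMS))]


def me_fit(conditionals, sweeps=100):
    """Approximate the ME-distribution for conditionals (B|A)[x] from a uniform prior.

    Each conditional is a triple (consequent, antecedent, x) with 0 < x < 1,
    where consequent and antecedent are predicates on worlds.
    Assumes the constraints are consistent and all masses stay positive.
    """
    p = [1.0 / len(WORLDS)] * len(WORLDS)
    for _ in range(sweeps):
        for cons, ante, x in conditionals:
            verifying = [i for i, w in enumerate(WORLDS) if ante(w) and cons(w)]
            falsifying = [i for i, w in enumerate(WORLDS) if ante(w) and not cons(w)]
            p_ver = sum(p[i] for i in verifying)
            p_fal = sum(p[i] for i in falsifying)
            # Factor that makes P(B|A) = x after rescaling.
            alpha = x * p_fal / ((1.0 - x) * p_ver)
            for i in verifying:
                p[i] *= alpha ** (1.0 - x)
            for i in falsifying:
                p[i] *= alpha ** (-x)
            total = sum(p)
            p = [value / total for value in p]  # renormalize
    return p


def cond_prob(p, cons, ante):
    """Conditional probability P(cons | ante) under the distribution p."""
    num = sum(v for v, w in zip(p, WORLDS) if ante(w) and cons(w))
    den = sum(v for v, w in zip(p, WORLDS) if ante(w))
    return num / den


rules = [
    (lambda w: w["b"], lambda w: w["a"], 0.8),  # (b|a)[0.8]
    (lambda w: w["c"], lambda w: w["b"], 0.9),  # (c|b)[0.9]
]
p_star = me_fit(rules)
```

Each inner step changes the current distribution only by one factor on the
verifying worlds and one on the falsifying worlds of a single conditional,
which is the kind of local adjustment that a LEG-based implementation can
confine to the hyperedge containing that conditional.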

So LEG-networks provide an efficient method to perform local computations
for ME-reasoning, reducing its complexity (cf. [Par94]). Their role
is similar to that of Bayesian networks for probabilistic reasoning in
general. There are, however, crucial differences between Bayesian reasoning
and ME-reasoning:

- Instead of imposing external assumptions of (conditional) independencies
  on the variables, ME-reasoning follows the internal structure of conditionals
  to install independencies intensionally: Independence is only assumed when
  no other information is available.

- ME-reasoning does not require the specification of large numbers of
  conditional probabilities, as is needed to build up a Bayesian network.
  Instead, it offers the possibility of specifying knowledge by conditionals
  in an intuitive way, requiring the user only to list and quantify the
  relationships considered relevant. The remaining values are filled in
  in an information-theoretically optimal manner.

Thus, ME-reasoning seems to combine sound probabilistic reasoning ideally
with intuitive knowledge representation, providing a powerful machinery to
realize commonsense and expert reasoning in demanding domains like medical
diagnosis. Another ME-system, LEXMED², is already used to support
physicians in diagnosing appendicitis in a German hospital (cf. [SE99]). LEXMED
is based on the system shell PIT ([FS96, SF97]). PIT not only accepts precise
probabilities, but also allows one to specify intervals of probability values
for the conditionals. An approach to combine ME-reasoning with probabilistic
logic programming techniques is presented in [LKI99].



9.2 Probabilistic Knowledge Discovery 

Within the field of probabilistic knowledge discovery, conditionals of a
simple syntax have proved to be of particular importance. These conditionals
usually have conjunctions of atoms or literals in their antecedents and
consequents, avoiding disjunctions in order to pinpoint relevant information.
Association rules (cf. [AIS93]) only make use of positive literals, whereas
single-elementary conditionals have only one (positive) literal as their
consequent (cf. Definition 8.1.1).

² Homepage of LEXMED: http://lexmed.fh-weingarten.de

In [KIR96], we dealt with discovering relevant and significant
single-elementary rules in a given probability distribution P. There, the
significance of a probabilistic rule is measured simply by its probability
with respect to a threshold $\varepsilon$: $(B|A)$ is called
$\varepsilon$-significant if $P(B|A) > 1 - \varepsilon$. Relevance
aims at presenting relationships in a concise way, that is, by shortening the
(conjunctive) antecedent of a rule without giving up significance. To find
significant single-elementary conditionals, one has to check, for each
elementary conjunction, the corresponding ratios of probabilities to its
neighboring conjunctions. To introduce the notion of relevance, we gave a
criterion for which elementary conjunctions should be investigated to bring
forth rules with a particularly short antecedent. The level of significance
$\varepsilon$ can be chosen by the user, so as to allow investigations on
different levels of abstraction.
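
The two notions above, significance with respect to a threshold and the
shortening of antecedents, can be illustrated by a brute-force Python sketch.
It is not the algorithm of [KIR96], which works with ratios between
neighboring elementary conjunctions; it merely enumerates sub-antecedents.
The function names, the encoding of antecedents as dictionaries of literals,
the strict comparison and the example distribution are assumptions made for
illustration.

```python
from itertools import combinations, product

ATOMS = ("a", "b", "c")
WORLDS = list(product((True, False), repeat=len(ATOMS)))


def prob(p, literals):
    """Probability of a conjunction of literals given as {atom: truth value}."""
    return sum(
        pr for world, pr in zip(WORLDS, p)
        if all(world[ATOMS.index(atom)] == val for atom, val in literals.items())
    )


def is_significant(p, antecedent, consequent, eps):
    """Check P(consequent | antecedent) > 1 - eps for a positive consequent atom."""
    p_ante = prob(p, antecedent)
    if p_ante == 0:
        return False
    return prob(p, {**antecedent, consequent: True}) / p_ante > 1 - eps


def shortest_significant_antecedents(p, antecedent, consequent, eps):
    """Return the smallest sub-conjunctions of the antecedent that stay significant."""
    for size in range(len(antecedent) + 1):
        hits = [dict(sub) for sub in combinations(antecedent.items(), size)
                if is_significant(p, dict(sub), consequent, eps)]
        if hits:
            return hits
    return []


# Hypothetical distribution over the worlds in the order of WORLDS:
# abc, ab~c, a~bc, a~b~c, ~abc, ~ab~c, ~a~bc, ~a~b~c
p = [0.30, 0.01, 0.18, 0.01, 0.05, 0.20, 0.05, 0.20]
short = shortest_significant_antecedents(p, {"a": True, "b": True}, "c", eps=0.1)
```

In this example the rule with antecedent a and b and consequent c is
significant, and so is the shorter rule with antecedent a alone, whereas
neither the antecedent b nor the empty antecedent reaches the threshold.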

The algorithm presented in [KIR96] starts with "long" rules, successively
shortening the elementary conjunctions under consideration. In contrast to
this, the algorithm in [Sch98] begins with short conjunctions, extending the
antecedents of rules in search of exceptions. The program implemented in
[Sch98] also checks the quality of the set of discovered probabilistic rules by
measuring the information-theoretical distance between the corresponding
ME-distribution and the original distribution P. In [Mül98], the idea of a
structural interestingness of probabilistic rules is discussed, and a program
to read data from databases and to find association rules was implemented.
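
A quality check of the kind mentioned above can be sketched as follows. The
code assumes that the information-theoretical distance is relative entropy
(Kullback-Leibler divergence) measured in bits; which distance and which
direction [Sch98] actually uses is not stated here, so the function
`relative_entropy` and its argument order are illustrative assumptions.

```python
import math


def relative_entropy(p, q):
    """Relative entropy sum_w p(w) * log2(p(w) / q(w)) of p with respect to q.

    Both arguments are sequences of probabilities over the same worlds.
    Returns infinity if p puts mass on a world where q has none.
    """
    total = 0.0
    for p_w, q_w in zip(p, q):
        if p_w == 0.0:
            continue  # the term 0 * log(0 / q) is taken to be 0
        if q_w == 0.0:
            return math.inf
        total += p_w * math.log2(p_w / q_w)
    return total


# Hypothetical two-world example: original distribution vs. reconstruction.
original = [0.5, 0.5]
reconstructed = [0.25, 0.75]
distance = relative_entropy(original, reconstructed)
```

A value of 0 means that the distribution generated from the discovered rules
reproduces the original distribution exactly; larger values indicate that
information has been lost.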



9.3 Possibilistic Belief Revision 

Another approach to quantified uncertain reasoning is taken in [Hoc99]
by means of possibilistic logic: Instead of assigning one degree of uncertainty
to each proposition, as in probabilistic logic, possibilistic logic allows one
to specify the epistemic attitude towards a proposition by two values, a degree
of necessity and a degree of possibility, which are usually assumed to range
within the unit interval (cf. [DLP94]). Possibility and necessity measures are
both determined by possibility distributions $\pi : \Omega \to [0, 1]$.
Possibilistic logic aims at capturing qualitative epistemic relationships
between propositions. Indeed, possibility distributions are very similar to
ordinal conditional functions (cf. [DP94]); in particular, the degree of
possibility of a disjunction is the maximum of the degrees of possibility of
the disjuncts.

In [Hoc99], a program was implemented that computes a representation of
a set of propositional formulas, each equipped with a degree of necessity, by a
possibility distribution. The resulting knowledge base can be modified via the
belief change operations of expansion, revision and contraction (cf. Section
2.2). In possibility theory, a deduction theorem similar to that in classical
logic holds, allowing the program to derive (new) possibilistic knowledge by
making use of a possibilistic resolution calculus. As a special feature of
possibilistic logic, however, the program is capable of tolerating
inconsistencies and of taking the degree of inconsistency of a knowledge base
into account when deriving possibilistic information. The theoretical
background for possibilistic deduction and possibilistic change operations is
explained in detail in [Hoc99].
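
How a possibility distribution determines both measures, as mentioned at the
beginning of this section, can be sketched in a few lines of Python. The
sketch is not taken from [Hoc99]; it assumes the usual duality between
necessity and possibility, $N(A) = 1 - \Pi(\overline{A})$, and the function
names and the example distribution are hypothetical.

```python
from itertools import product

ATOMS = ("a", "b")
WORLDS = list(product((True, False), repeat=len(ATOMS)))


def possibility(pi, formula):
    """Degree of possibility: the maximum of pi over the models of the formula."""
    return max((pi[w] for w in WORLDS if formula(w)), default=0.0)


def necessity(pi, formula):
    """Degree of necessity, assuming the duality N(A) = 1 - Pi(not A)."""
    return 1.0 - possibility(pi, lambda w: not formula(w))


# Hypothetical possibility distribution over the worlds (a, b).
pi = {(True, True): 1.0, (True, False): 0.4,
      (False, True): 0.7, (False, False): 0.2}

not_a = lambda w: not w[0]
not_b = lambda w: not w[1]
not_a_or_not_b = lambda w: (not w[0]) or (not w[1])
```

With this distribution, the possibility of the disjunction of the negated
atoms equals the larger of the two individual degrees, which is the maximum
property stated above.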




10. Conclusion 



In this book, a profound and extensive investigation of how to handle
uncertain knowledge both in qualitative and in quantitative settings was
presented. The unifying framework announced in its title is provided by
conditionals. Observing conditional structures and preserving conditional
beliefs turned out to be of crucial importance when reasoning under
uncertainty and changing epistemic knowledge bases.

The approach to conditionals developed and used here is quite different
from the logical one usually taken. Rather, we featured a dynamic view on
conditionals as actors on worlds, shifting them appropriately to establish
conditional beliefs. In this context, an essential notion was that of a
conditional valuation function, allowing us to conceive conditional reasoning
actually as extending propositional reasoning by a new dimension. Conditional
valuation functions abstract from the concrete representation of epistemic
attitudes (e.g. by probability distributions or by ordinal conditional
functions) and provide the formal framework to reveal fundamental patterns of
(quantified) conditional reasoning. In particular, the idea of preserving
conditional beliefs under change operations could be put in precise formal
terms, capturing interactions of high complexity between atoms and relating
them to the (sets of) conditionals inducing change. It was shown that this
principle of conditional preservation not only has a crucial meaning for
quantitative conditional reasoning, but also covers corresponding ideas and
approaches in qualitative settings. Therefore, it can be considered to
formalize an important paradigm in general epistemic reasoning.

Despite the complexity of the underlying theory, revisions of probability
distributions and ordinal conditional functions preserving conditional
beliefs turned out to follow a strikingly simple conditional design. This
approach was pursued within the probabilistic framework to elaborate further
conditions such a revision should reasonably satisfy. We found that crucial
topics in this area were the realization of a functional concept depending on
conditional as well as on numerical structures, the independence of the
syntactical representation of probabilistic knowledge, and the coherent
handling of iterated revisions (logical coherence). Together with the principle
of conditional preservation, the corresponding three postulates were apt to
distinguish a single probabilistic revision operator, namely revision by
optimizing entropy (ME-revision). Thus we characterized this powerful
technique of incorporating new probabilistic information by focussing on the
consistent handling of conditionals, establishing the ME-methods as a most
important tool to carry out quantified uncertain reasoning. We classified
ME-reasoning according to the formal framework of nonmonotonic reasoning,
proving it to satisfy crucial demands in this area. Moreover, we showed how to
use the formal representation of the ME-distribution to derive some inference
patterns, that is, how to calculate ME-probabilities in some simple, but
illustrative examples.

From general epistemic belief revision to probabilistic change operations
and back again: ME-revision should not only be looked upon as an
interesting but very special example of realizing probabilistic revision ideas.
Rather, its formal investigation revealed fundamental revision mechanisms apt
to influence the whole area of epistemic belief revision. We gave a precise
formalization of the principle of conditional preservation first for
ME-inference in [KI98a], later transferred it to a qualitative setting (cf.
[KI99b, KI99a]), and then extended it here to conditional valuation functions
in general. Moreover, the postulate for logical coherence, first discovered as
a property of ME-inference in [SJ81], turned out to be of essential importance
to handle iterated revisions. The complexity of ME-reasoning illustrates how
to distinguish between simultaneous and successive revisions, how to take
background knowledge into account, and how to realize different belief change
operations such as (genuine) revision, updating and focusing, by using one and
the same change operation in different ways. For nonmonotonic reasoning, it
proves to be an elegant example of how to incorporate dependence on an
underlying theory or epistemic state, and in particular, how to generalize the
central notion of cumulativity to also compare inferences based on different
epistemic states.

The mathematical background used to formalize the principle of conditional
preservation also paid off when managing the task of discovering
knowledge: Conditional valuation functions following the structures imposed by
some underlying set of conditionals necessarily satisfy certain numerical
relations. So to discover such a set of conditionals, numerical relationships,
e.g. in a probability distribution, can be exploited and transferred into
equations on some basic conditionals. Solving these equations means joining
conditionals appropriately and thereby induces operations on conditionals,
with the aim to construct a set of conditionals adequately representing the
relevant conditional information incorporated in the valuation function under
consideration. We illustrated how this method may be used to discover an
ME-optimal set of conditionals within a probability distribution. This clearly
goes beyond the finding of isolated association rules, since structures of
knowledge are revealed. Such structures are essential for displaying complex
relationships between variables, and they are necessary for hypothetical
reasoning or for simulation tasks.

The definitions and techniques introduced for conditionals here present a
completely new approach to handling conditionals in nonmonotonic reasoning
and belief revision. They not only provide a rich methodological framework
to establish and prove the central results of this book, but are also of
interest in themselves.




A. Proofs 



Proofs of Chapter 3 



Proof of Lemma 3.4.5: If $(B|A) \sqcup (D|C) \in (\mathcal{L} \mid \mathcal{L})$
exists, then both $(B|A)$ and $(D|C)$ are subconditionals of it. Conversely,
if $(B|A)$ and $(D|C)$ are subconditionals of $(F|E)$, then
$(B|A)^+ \cup (D|C)^+ \subseteq (F|E)^+$ and
$(B|A)^- \cup (D|C)^- \subseteq (F|E)^-$. Therefore these two sets are
disjoint, and so the supremum exists. □



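Proof of Lemma 3.4.10: $\psi_{\omega,\omega'} \sqsubseteq (D|C)$ iff
$\omega \models CD$ and $\omega' \models C\overline{D}$; in particular,
$form(\omega, \omega') \le C$ in this case.

$(D|C) \perp\!\!\!\perp (B|A)$ iff $Mod(C)$ is included in exactly one of
$Mod(AB)$, $Mod(A\overline{B})$, $Mod(\overline{A})$. So
$\psi_{\omega,\omega'} \perp\!\!\!\perp (B|A)$ for all
$\psi_{\omega,\omega'} \sqsubseteq (D|C)$ if $(D|C) \perp\!\!\!\perp (B|A)$.

Conversely, suppose $\psi_{\omega,\omega'} \perp\!\!\!\perp (B|A)$ for all
$\psi_{\omega,\omega'} \sqsubseteq (D|C)$. Let $\omega_1 \models CD$,
$\omega_2 \models C\overline{D}$. Then
$\psi_{\omega_1,\omega_2} \sqsubseteq (D|C)$, and, due to the presupposition,
$\omega_1, \omega_2 \in M$, where $M$ is one of $Mod(AB)$, $Mod(A\overline{B})$,
$Mod(\overline{A})$. Let $\omega \models C$. Then $\omega \models CD$ or
$\omega \models C\overline{D}$. If $\omega \models CD$, then
$\psi_{\omega,\omega_2} \sqsubseteq (D|C)$, so
$\psi_{\omega,\omega_2} \perp\!\!\!\perp (B|A)$ and therefore $\omega \in M$,
too. If $\omega \models C\overline{D}$, then $\omega \in M$ follows by
considering $\psi_{\omega_1,\omega}$. Thus $Mod(C) \subseteq M$, which means
$(D|C) \perp\!\!\!\perp (B|A)$. □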

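Proof of Lemma 3.5.1: Let
$\sigma : \Omega \to \mathcal{F} = \langle a_1^+, a_1^-, \ldots, a_n^+, a_n^- \rangle$
be a map from the set of worlds to the free abelian group generated by
$a_1^+, a_1^-, \ldots, a_n^+, a_n^-$ such that $\sigma(\omega)$ contains at
most one of each $a_i^+$ or $a_i^-$. For each $i$, $1 \le i \le n$, define
formulas $A_i, B_i$ such that

$$Mod(A_i) := \{\omega \in \Omega \mid a_i^+ \text{ or } a_i^- \text{ occurs in } \sigma(\omega)\}$$

$$Mod(B_i) := \{\omega \in \Omega \mid a_i^+ \text{ occurs in } \sigma(\omega)\}$$

Set $\mathcal{R} = \{(B_i|A_i) \mid 1 \le i \le n,\ Mod(A_i), Mod(B_i) \ne \emptyset\}$.
It is straightforward to check that $\sigma_{\mathcal{R}} = \sigma$. □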
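
The construction used in this proof can be replayed on a small example. The
Python sketch below is only an illustration: it encodes the group element
$\sigma(\omega)$ as the set of symbols occurring in it, written as pairs
`(i, "+")` or `(i, "-")`, and represents a formula by its set of models. The
function names and the four-world example are assumptions, not notation from
the text.

```python
def conditionals_from_structure(sigma, n):
    """Build (Mod(B_i), Mod(A_i)) for each i from a map sigma: world -> symbol set."""
    rules = {}
    for i in range(1, n + 1):
        mod_a = {w for w, s in sigma.items() if (i, "+") in s or (i, "-") in s}
        mod_b = {w for w, s in sigma.items() if (i, "+") in s}
        if mod_a and mod_b:  # keep only conditionals with non-empty model sets
            rules[i] = (mod_b, mod_a)
    return rules


def structure_from_conditionals(rules, worlds):
    """Recompute the conditional structure of each world from the conditionals."""
    sigma = {}
    for w in worlds:
        symbols = set()
        for i, (mod_b, mod_a) in rules.items():
            if w in mod_a:  # the conditional applies: verified or falsified
                symbols.add((i, "+") if w in mod_b else (i, "-"))
        sigma[w] = frozenset(symbols)
    return sigma


# Hypothetical structure on four worlds with n = 2 pairs of generators.
sigma = {
    "w1": frozenset({(1, "+"), (2, "-")}),
    "w2": frozenset({(1, "-")}),
    "w3": frozenset({(2, "+")}),
    "w4": frozenset(),
}
rules = conditionals_from_structure(sigma, 2)
recovered = structure_from_conditionals(rules, sigma)
```

In the example, the recomputed structure coincides with the given map, as
the last step of the proof asserts.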
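Proof of Lemma 3.5.3: According to (3.23), it holds that

$$\ker \sigma_{\mathcal{R} \cup \{(\top|\top)\}} = \ker \sigma_{\mathcal{R}} \cap \ker \sigma_{(\top|\top)} = \{\hat{\omega} \in \ker \sigma_{\mathcal{R}} \mid \sigma_{(\top|\top)}(\hat{\omega}) = 1\}$$

Because $(\top|\top)$ is confirmed by all worlds $\omega \in \Omega$, we have
$\sigma_{(\top|\top)}(\omega) = a^+$ for all $\omega \in \Omega$, so for
$\hat{\omega} = \omega_1^{r_1} \cdots \omega_m^{r_m}$,

$$\sigma_{(\top|\top)}(\hat{\omega}) = (a^+)^{\sum_{i=1}^{m} r_i}$$

Therefore $\hat{\omega} \in \ker \sigma_{(\top|\top)}$ iff
$\sum_{i=1}^{m} r_i = 0$, which means
$\ker \sigma_{(\top|\top)} = \hat{\Omega}_0$. This proves the lemma. □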



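Proof of Proposition 3.5.1: Proof of (1): Let $(D|C) \sqsubseteq (\dot{B}|A)$,
where $\dot{B}$ is one of $B, \overline{B}$. Then $C \le A$, by Lemma 3.4.2.
Let $\hat{\omega} = \omega_1^{r_1} \cdots \omega_m^{r_m} \in \ker \sigma_{(D|C)} \cap \hat{C}$,
thus $\omega_k \models C$ for all $1 \le k \le m$, and hence
$(D|C)(\omega_k) \in \{0, 1\}$, using notation (3.1). If
$(D|C) \sqsubseteq (B|A)$ then $(D|C)(\omega_k) = (B|A)(\omega_k)$, and if
$(D|C) \sqsubseteq (\overline{B}|A)$ then
$(D|C)(\omega_k) = 1 - (B|A)(\omega_k)$, for all $1 \le k \le m$. Anyway,
$\omega_k \models A$, and $1 = \sigma_{(D|C)}(\hat{\omega})$ implies

$$\sum_{k:(D|C)(\omega_k)=1} r_k = \sum_{k:(D|C)(\omega_k)=0} r_k = 0,$$

due to (3.21). But then
$\sum_{k:(B|A)(\omega_k)=1} r_k = \sum_{k:(B|A)(\omega_k)=0} r_k = 0$, too,
and therefore $\sigma_{(B|A)}(\hat{\omega}) = 1$.

Conversely, assume that $(D|C)$ is neither a subconditional of $(B|A)$ nor
of $(\overline{B}|A)$, and let $C \le A$. This means that also
$CD, C\overline{D} \le A$, and that there are worlds
$\omega_1, \omega_2 \in \Omega$ such that
$(D|C)(\omega_1) = (D|C)(\omega_2) \ne u$, but
$(B|A)(\omega_1) \ne (B|A)(\omega_2)$. Then $\omega_1, \omega_2 \models C$ and
$\frac{\omega_1}{\omega_2} \in \ker \sigma_{(D|C)} \cap \hat{C}$, but
$\frac{\omega_1}{\omega_2} \notin \ker \sigma_{(B|A)}$.

Proof of (2): Let $(D|C) \perp\!\!\!\perp (B|A)$, i.e. $C \le AB$,
$A\overline{B}$ or $\overline{A}$, respectively, according to Lemma 3.4.9.
Thus $\sigma_{(B|A)}(\omega)$ is the same for all $\omega \models C$. Due to
cancellations, $\hat{C} \cap \hat{\Omega}_0 \subseteq \ker \sigma_{(B|A)}$.

Conversely, suppose $(D|C) \perp\!\!\!\perp (B|A)$ does not hold. Then there
are $\omega_1, \omega_2 \models C$ such that
$\sigma_{(B|A)}(\omega_1) \ne \sigma_{(B|A)}(\omega_2)$, i.e.
$\sigma_{(B|A)}\left(\frac{\omega_1}{\omega_2}\right) \ne 1$. So
$\frac{\omega_1}{\omega_2} \in \hat{C} \cap \hat{\Omega}_0$, but
$\frac{\omega_1}{\omega_2} \notin \ker \sigma_{(B|A)}$. □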

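Proof of Lemma 3.5.4: $\psi_{\omega,\omega'} \sqsubseteq (B|A)$ iff
$\omega \in (B|A)^+$, $\omega' \in (B|A)^-$. Therefore
$\sigma_{(B|A)}\left(\frac{\omega}{\omega'}\right) = \frac{a^+}{a^-} \ne 1$. □

Proof of Corollary 3.5.1: Let
$\frac{\omega}{\omega'} \in \ker \sigma_{\mathcal{R}} = \bigcap_{(B|A) \in \mathcal{R}} \ker \sigma_{(B|A)}$,
thus $\sigma_{(B|A)}\left(\frac{\omega}{\omega'}\right) = 1$ for all
$(B|A) \in \mathcal{R}$. The assertion follows by Lemma 3.5.4. □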

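Proof of Lemma 3.6.1: Let $\omega_1, \omega_2 \in \Omega$ such that
$\sigma_{\mathcal{R}}(\omega_1) = \sigma_{\mathcal{R}}(\omega_2)$. If
$V(\omega_1) = 0^{\mathcal{A}}$ then there is $(B|A) \in \mathcal{R}$ with
$\sigma_{(B|A)}(\omega_1) \ne 1$, and
$\sigma_{\mathcal{R}}(\omega_1) = \sigma_{\mathcal{R}}(\omega_2)$ implies in
particular $\sigma_{(B|A)}(\omega_1) = \sigma_{(B|A)}(\omega_2)$, and hence
$V(\omega_2) = 0^{\mathcal{A}}$, too, by condition (i) of Definition 3.6.1(1).

Now suppose $V(\omega_1), V(\omega_2) \ne 0^{\mathcal{A}}$, i.e.
$\omega_1, \omega_2 \in \hat{\Omega}_+$. Moreover, we have
$\frac{\omega_1}{\omega_2} \in \hat{\Omega}_0$, so due to the presupposition
$\sigma_{\mathcal{R}}(\omega_1) = \sigma_{\mathcal{R}}(\omega_2)$, we obtain
$V(\omega_1) = V(\omega_2)$, by condition (ii) of Definition 3.6.1, (1) and
(2). □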



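Proof of Proposition 3.6.1: Suppose $V : \mathcal{L} \to \mathcal{A}$ is a
conditional valuation function which is strictly indifferent with respect to
$\mathcal{R}$. Let $\hat{\omega} \in \ker \sigma_{\mathcal{R}} \cap \hat{\Omega}_+$,
i.e. $\sigma_{\mathcal{R}}(\hat{\omega}) = 1 = \sigma_{\mathcal{R}}(e_{\hat{\Omega}})$,
where $e_{\hat{\Omega}}$ is the empty word in $\hat{\Omega}$. Because $V$ is
indifferent with respect to $\mathcal{R}$, we obtain
$V(\hat{\omega}) = V(e_{\hat{\Omega}}) = 1$, so $\hat{\omega} \in \ker V$.

Conversely, let $V : \mathcal{L} \to \mathcal{A}$ be a conditional valuation
function such that condition (i) of Definition 3.6.1 holds and
$\ker \sigma_{\mathcal{R}} \cap \hat{\Omega}_+ \subseteq \ker V$. Suppose
$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2)$
for $\hat{\omega}_1, \hat{\omega}_2 \in \hat{\Omega}_+$. Then
$\sigma_{\mathcal{R}}(\hat{\omega}_1 \hat{\omega}_2^{-1}) = 1$, i.e.
$\hat{\omega}_1 \hat{\omega}_2^{-1} \in \ker \sigma_{\mathcal{R}} \cap \hat{\Omega}_+ \subseteq \ker V$.
This implies $V(\hat{\omega}_1 \hat{\omega}_2^{-1}) = 1$, and thus
$V(\hat{\omega}_1) = V(\hat{\omega}_2)$. Therefore $V$ is strictly indifferent
with respect to $\mathcal{R}$.

The second part of the proposition follows from the first one by observing
Lemma 3.5.3. □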



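Proof of Theorem 3.6.1: Let $P$ be a probability function and
$\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ be a set of conditionals.
Suppose first that $P$ is strictly indifferent with respect to $\mathcal{R}$.
Then $P(A_i) \ne 0$, due to the prerequisite in Definition 3.6.1. The
equivalence relation $\equiv_{\mathcal{R}}$ induces a partition
$\Omega_1, \ldots, \Omega_q$ of $\Omega$ in disjoint classes so that,
according to Lemma 3.6.1, $P(\omega)$ is constant on each equivalence class.
Assume $P(\omega) = p_j$ for $\omega \in \Omega_j$. Let
$\omega_1, \ldots, \omega_q \in \Omega$ be a representative system of
$\Omega_1, \ldots, \Omega_q$.

For the sake of simplicity of notation, we suppose that
$p_1, \ldots, p_{q'} > 0$, $p_{q'+1} = \ldots = p_q = 0$ with $q' \le q$.

For all $P(\omega_j) = p_j = 0$, $q' < j \le q$, there is
$(B_{i_j}|A_{i_j}) \in \mathcal{R}$ such that
$\sigma_{(B_{i_j}|A_{i_j})}(\omega_j) \ne 1$ and $P(\omega') = 0$ for all
$\omega'$ with
$\sigma_{(B_{i_j}|A_{i_j})}(\omega') = \sigma_{(B_{i_j}|A_{i_j})}(\omega_j)$.
If $\sigma_{(B_{i_j}|A_{i_j})}(\omega_j) = a_{i_j}^+$ then set
$\alpha_{i_j}^+ = 0$ and $\alpha_{i_j}^- = 1$; if
$\sigma_{(B_{i_j}|A_{i_j})}(\omega_j) = a_{i_j}^-$ then set
$\alpha_{i_j}^+ = 1$ and $\alpha_{i_j}^- = 0$. Without loss of generality,
assume that those conditionals $(B_{i_j}|A_{i_j}) \in \mathcal{R}$ are the
conditionals $(B_i|A_i)$, $n' < i \le n$.

Let us now consider the constants $p_j \ne 0$. Finding positive factors
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_{n'}^+, \alpha_{n'}^-$ with
$P(\omega) = \prod_{1 \le i \le n',\ \omega \models A_i B_i} \alpha_i^+ \prod_{1 \le i \le n',\ \omega \models A_i \overline{B_i}} \alpha_i^-$
for $P(\omega) \ne 0$ amounts to solving the following system of $q'$
equations

$$p_j = \prod_{\substack{1 \le i \le n' \\ \omega_j \models A_i B_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n' \\ \omega_j \models A_i \overline{B_i}}} \alpha_i^-, \quad 1 \le j \le q', \qquad \text{(A.1)}$$

which can be transformed into a linear equational system

$$\Theta \beta = \lambda \qquad \text{(A.2)}$$

with $\beta = (\log \alpha_1^+, \log \alpha_1^-, \ldots, \log \alpha_{n'}^+, \log \alpha_{n'}^-)^T \in \mathbb{R}^{2n'}$,
$\lambda = (\log p_1, \ldots, \log p_{q'})^T \in \mathbb{R}^{q'}$ (where
$\mathbb{R}$ denotes the field of real numbers) and a $q' \times 2n'$-matrix
$\Theta$ with elements in $\{0, 1\}$: $\theta_{j,2i} = 1$ iff
$\sigma_i(\omega_j) = a_i^+$, $\theta_{j,2i+1} = 1$ iff
$\sigma_i(\omega_j) = a_i^-$ for $1 \le j \le q'$, $1 \le i \le n'$. Let
$\theta_j$, $1 \le j \le q'$, denote the rows of $\Theta$. The equational
system (A.2) is solvable over $\mathbb{R}$ iff any linear dependencies (over
the field of rationals, because each entry of $\Theta$ is either 0 or 1)
between these rows correspond to relations between the
$\lambda_j = \log p_j$, i.e.
$\sum_k r_{m_k} \theta_{m_k} = \sum_l s_{n_l} \theta_{n_l}$ must imply
$\sum_k r_{m_k} \lambda_{m_k} = \sum_l s_{n_l} \lambda_{n_l}$ with rationals
$r_{m_k}, s_{n_l}$.

Arranging and multiplying both sums appropriately, we may assume
$\sum_k r_{m_k} \theta_{m_k} = \sum_l s_{n_l} \theta_{n_l}$ with natural
numbers $r_{m_k}, s_{n_l}$. By comparing the vector components, we obtain

$$\sum_k r_{m_k} \theta_{m_k,2i} = \sum_l s_{n_l} \theta_{n_l,2i}, \qquad \sum_k r_{m_k} \theta_{m_k,2i+1} = \sum_l s_{n_l} \theta_{n_l,2i+1}, \qquad 1 \le i \le n'.$$

These equations imply

$$\sum_{k:\sigma_i(\omega_{m_k}) = a_i^+} r_{m_k} = \sum_{l:\sigma_i(\omega_{n_l}) = a_i^+} s_{n_l} \quad \text{and} \quad \sum_{k:\sigma_i(\omega_{m_k}) = a_i^-} r_{m_k} = \sum_{l:\sigma_i(\omega_{n_l}) = a_i^-} s_{n_l}.$$

Therefore the elements $\prod_k \omega_{m_k}^{r_{m_k}}$ and
$\prod_l \omega_{n_l}^{s_{n_l}}$ are $\mathcal{R}$-equivalent by equation
(3.22) on page 45, and because $P$ is assumed to be strictly indifferent with
respect to $\mathcal{R}$, we obtain

$$\prod_k p_{m_k}^{r_{m_k}} = P\Big(\prod_k \omega_{m_k}^{r_{m_k}}\Big) = P\Big(\prod_l \omega_{n_l}^{s_{n_l}}\Big) = \prod_l p_{n_l}^{s_{n_l}}.$$

Applying the logarithm function now yields

$$\sum_k r_{m_k} \lambda_{m_k} = \sum_l s_{n_l} \lambda_{n_l},$$

as desired. Thus the equational system (A.2), or (A.1), respectively,
is solvable, yielding a solution
$\beta = (\beta_1^+, \beta_1^-, \ldots, \beta_{n'}^+, \beta_{n'}^-)^T \in \mathbb{R}^{2n'}$.
Setting $\alpha_i^+ = \exp(\beta_i^+)$ and $\alpha_i^- = \exp(\beta_i^-)$,
$1 \le i \le n'$, we obtain
$P(\omega) = \prod_{1 \le i \le n',\ \omega \models A_i B_i} \alpha_i^+ \prod_{1 \le i \le n',\ \omega \models A_i \overline{B_i}} \alpha_i^-$
for $P(\omega) \ne 0$. Taking now also into account the conditionals
$(B_{n'+1}|A_{n'+1}), \ldots, (B_n|A_n)$, belonging to $P(\omega_j) = 0$, we
thus have

$$P(\omega) = \prod_{\substack{1 \le i \le n \\ \omega \models A_i B_i}} \alpha_i^+ \prod_{\substack{1 \le i \le n \\ \omega \models A_i \overline{B_i}}} \alpha_i^-$$

for all $\omega \in \Omega$ because the non-zero factors belonging to those
conditionals are 1.

To prove the converse, assume
$P(\omega) = \prod_{1 \le i \le n,\ \omega \models A_i B_i} \alpha_i^+ \prod_{1 \le i \le n,\ \omega \models A_i \overline{B_i}} \alpha_i^-$
is a probability distribution with
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^- \in \mathbb{R}^+$. We
have to show the strict indifference of $P$ with respect to $\mathcal{R}$.

If $P(\omega) = 0$ then there is $(B_i|A_i) \in \mathcal{R}$ such that
$\omega \models A_i B_i$ and $\alpha_i^+ = 0$, or
$\omega \models A_i \overline{B_i}$ and $\alpha_i^- = 0$. So, in any case
$\sigma_i(\omega) \ne 1$ and $P(\omega') = 0$ for any $\omega' \in \Omega$
with $\sigma_i(\omega') = \sigma_i(\omega)$. This shows condition (i) of
Definition 3.6.1.

Now consider two $\mathcal{R}$-equivalent elements

$$\hat{\omega}_1 = \prod_k \omega_k^{r_k} \quad \text{and} \quad \hat{\omega}_2 = \prod_l \nu_l^{s_l}$$

with identical conditional structures

$$\sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2).$$

Then
$\sum_{k:\sigma_i(\omega_k) = a_i^+} r_k = \sum_{l:\sigma_i(\nu_l) = a_i^+} s_l$
and
$\sum_{k:\sigma_i(\omega_k) = a_i^-} r_k = \sum_{l:\sigma_i(\nu_l) = a_i^-} s_l$
for all $i = 1, \ldots, n$ according to equation (3.22) on page 45. Checking
condition (ii) of Definition 3.6.1 is now an easy calculation:

$$\begin{aligned}
P(\hat{\omega}_1) &= P(\omega_1)^{r_1} \cdots P(\omega_{m_1})^{r_{m_1}} \\
&= \prod_{1 \le i \le n} (\alpha_i^+)^{\sum_{k:\sigma_i(\omega_k) = a_i^+} r_k} \prod_{1 \le i \le n} (\alpha_i^-)^{\sum_{k:\sigma_i(\omega_k) = a_i^-} r_k} \\
&= \prod_{1 \le i \le n} (\alpha_i^+)^{\sum_{l:\sigma_i(\nu_l) = a_i^+} s_l} \prod_{1 \le i \le n} (\alpha_i^-)^{\sum_{l:\sigma_i(\nu_l) = a_i^-} s_l} \\
&= P(\nu_1)^{s_1} \cdots P(\nu_{m_2})^{s_{m_2}} = P(\hat{\omega}_2).
\end{aligned}$$

□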



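Proof of Corollary 3.6.1: Let
$P : \mathcal{L} \to (\mathbb{R}^+, +, \cdot, 0, 1)$ be a probability
function. Following Theorem 3.6.1, $P$ is strictly indifferent with respect to
a set $\mathcal{R} = \{(B_1|A_1), \ldots, (B_n|A_n)\}$ iff $P(A_i) \ne 0$ for
all $i$, $1 \le i \le n$, and if there are real numbers
$\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^- \in \mathbb{R}^+$ such
that, for all $\omega \in \Omega$,

$$P(\omega) = \prod_{\omega \models A_i B_i} \alpha_i^+ \prod_{\omega \models A_i \overline{B_i}} \alpha_i^-.$$

Define $\bar{P} : \mathcal{F}_{\mathcal{R}} \to (\mathbb{R}^+, \cdot)$ by
$\bar{P}(a_i^+) := \alpha_i^+$, $\bar{P}(a_i^-) := \alpha_i^-$,
$1 \le i \le n$. Then for $\omega \in \Omega$,

$$\begin{aligned}
\bar{P} \circ \sigma_{\mathcal{R}}(\omega) &= \bar{P}\Big(\prod_{\omega \models A_i B_i} a_i^+ \prod_{\omega \models A_i \overline{B_i}} a_i^-\Big) \\
&= \prod_{\omega \models A_i B_i} \bar{P}(a_i^+) \prod_{\omega \models A_i \overline{B_i}} \bar{P}(a_i^-) \\
&= \prod_{\omega \models A_i B_i} \alpha_i^+ \prod_{\omega \models A_i \overline{B_i}} \alpha_i^- \\
&= P(\omega).
\end{aligned}$$

Conversely, if $\bar{P} : \mathcal{F}_{\mathcal{R}} \to (\mathbb{R}^+, \cdot)$
is a homomorphism satisfying $\bar{P} \circ \sigma_{\mathcal{R}} = P$,
then by setting $\bar{P}(a_i^+) =: \alpha_i^+$, $\bar{P}(a_i^-) =: \alpha_i^-$,
$1 \le i \le n$, we obtain a representation of $P$ as in (3.25), and so $P$ is
strictly indifferent with respect to $\mathcal{R}$. □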



Proofs of Chapter 4 



Proof of Theorem 4.1.1: Let $A, B, C, D \in \mathcal{L}$.

Suppose $C \le B$ or $C \le \overline{B}$. Then, according to Lemma 3.4.9,
$(D|C) \perp\!\!\!\perp (B|\top)$. (CR3) and (CR5) now imply (C1) and (C2).

(C3) and (C4) are direct consequences of (CR6) and (CR7) by using that
$(B|A) \sqsubseteq (B|\top)$ and $(\overline{B}|A) \sqsubseteq (\overline{B}|\top)$,
respectively, due to Lemma 3.4.2. □



Proof of Theorem 4.2.1: By observing that
$(form(\omega) \mid form(\omega, \omega')) \perp\!\!\!\perp (B|A)$
for all $\omega, \omega' \in Mod(AB)$ ($Mod(A\overline{B})$,
$Mod(\overline{A})$, respectively), (CR5) immediately implies equation (4.1),
due to Corollary 2.4.1.

Conversely, if $(D|C)$ is a conditional with $(D|C) \perp\!\!\!\perp (B|A)$,
then all $\omega \in \min(CD; \Psi)$, $\omega' \in \min(C\overline{D}; \Psi)$
lie in the same set $Mod(AB)$, or $Mod(A\overline{B})$, or
$Mod(\overline{A})$, respectively. (CR5) now follows from (4.1) by applying
Lemma 2.4.2 and Lemma 4.2.1. □



Proof of Proposition 4.2.1: Let $C$ be one of $AB$, $A\overline{B}$,
$\overline{A}$, respectively. Then $Bel((\Psi * (B|A)) * C) \models D$ iff
$\Psi * (B|A) \models (D|C)$, according to (RT). For $D \in \mathcal{L}$, we
have $(D|C) \perp\!\!\!\perp (B|A)$, due to the prerequisite imposed on $C$.
So, by (CR5), $\Psi * (B|A) \models (D|C)$ iff $\Psi \models (D|C)$, this
means, iff $Bel(\Psi * C) \models D$. □

Proof of Theorem 4.2.2: We prove only the representation of (CR6); that
of (CR7) is proved similarly.

Suppose $*$ satisfies (CR6). Let $\omega \in Mod(AB)$,
$\omega' \in Mod(A\overline{B})$ with $\omega <_{\Psi} \omega'$. Then, by
Corollary 2.4.1, $\Psi \models (form(\omega) \mid form(\omega, \omega'))$.
We have $(form(\omega) \mid form(\omega, \omega')) \sqsubseteq (B|A)$, so by
(CR6), $\Psi * (B|A) \models (form(\omega) \mid form(\omega, \omega'))$.
Again, by Corollary 2.4.1, $\omega <_{\Psi * (B|A)} \omega'$.

Suppose now that for all $\omega \in Mod(AB)$,
$\omega' \in Mod(A\overline{B})$, $\omega <_{\Psi} \omega'$ implies
$\omega <_{\Psi * (B|A)} \omega'$. Let $(D|C) \sqsubseteq (B|A)$ and
$\Psi \models (D|C)$. This means $CD <_{\Psi} C\overline{D}$, so for all
$\omega \in \min(CD; \Psi)$, $\omega' \in \min(C\overline{D}; \Psi)$, we have
$\omega <_{\Psi} \omega'$. Because of $CD \le AB$ and
$C\overline{D} \le A\overline{B}$, $\omega \in Mod(AB)$ and
$\omega' \in Mod(A\overline{B})$. By presupposition, this yields
$\omega <_{\Psi * (B|A)} \omega'$. Moreover, by Lemma 4.2.1,
$\min(CD; \Psi) = \min(CD; \Psi * (B|A))$ and
$\min(C\overline{D}; \Psi) = \min(C\overline{D}; \Psi * (B|A))$.
This shows $CD <_{\Psi * (B|A)} C\overline{D}$, and hence
$\Psi * (B|A) \models (D|C)$. □



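Proof of Proposition 4.5.1: Let $V^* = V * \mathcal{R} = V * (B|A)$ denote a
revision of the conditional valuation function
$V : \mathcal{L} \to \mathcal{A}$ by $\mathcal{R} = \{(B|A)\}$, and assume
$V^*(A) \ne 0^{\mathcal{A}}$.

$V^*$ satisfies the (strict) principle of conditional preservation with
respect to $V$ and $\mathcal{R}$ iff $V^*$ is (strictly) indifferent with
respect to $V$ and $\mathcal{R}$. According to Definition 4.5.1, this means in
particular that $V^*$ is $V$-consistent, and

$$(V^*/V)(\omega_1) = (V^*/V)(\omega_2) \text{ whenever } (B|A)(\omega_1) = (B|A)(\omega_2) \qquad \text{(A.3)}$$

for $V(\omega_1), V(\omega_2) \ne 0^{\mathcal{A}}$. Due to the prerequisite
$V^*(A) \ne 0^{\mathcal{A}}$ and the $V$-consistency of $V^*$, we have
$V(A) \ne 0^{\mathcal{A}}$, too, so $V(AB) \ne 0^{\mathcal{A}}$ or
$V(A\overline{B}) \ne 0^{\mathcal{A}}$. If $V(AB) = 0^{\mathcal{A}}$, then
$V^*(AB) = 0^{\mathcal{A}}$ and
$V(A\overline{B}), V^*(A\overline{B}) \ne 0^{\mathcal{A}}$. In this case,
there is $\omega^- \in Mod(A\overline{B})$ such that
$V(\omega^-), V^*(\omega^-) \ne 0^{\mathcal{A}}$; set
$\alpha^+ := 1^{\mathcal{A}}$, $\alpha^- := (V^*/V)(\omega^-)$. If
$V(A\overline{B}) = 0^{\mathcal{A}}$, then analogously,
$\alpha^- := 1^{\mathcal{A}}$ and $\alpha^+ := (V^*/V)(\omega^+)$ for some
suitable $\omega^+ \in Mod(AB)$. If both
$V(AB), V(A\overline{B}) \ne 0^{\mathcal{A}}$, then choose worlds
$\omega^+ \in Mod(AB)$, $\omega^- \in Mod(A\overline{B})$ such that
$V(\omega^+), V(\omega^-) \ne 0^{\mathcal{A}}$ and set
$\alpha^+ := (V^*/V)(\omega^+)$, $\alpha^- := (V^*/V)(\omega^-)$. Furthermore,
we have $V^*(\overline{A}) = 0^{\mathcal{A}}$ iff
$V(\overline{A}) = 0^{\mathcal{A}}$; in this case, set
$\alpha_0 := 1^{\mathcal{A}}$. Otherwise, select
$\omega_0 \in Mod(\overline{A})$ with
$V(\omega_0), V^*(\omega_0) \ne 0^{\mathcal{A}}$ and set
$\alpha_0 := (V^*/V)(\omega_0)$. Due to equation (A.3), we thus have

$$V^*(\omega) = \begin{cases} \alpha^+ \odot V(\omega) & \text{if } \omega \models AB \\ \alpha^- \odot V(\omega) & \text{if } \omega \models A\overline{B} \\ \alpha_0 \odot V(\omega) & \text{if } \omega \models \overline{A} \end{cases} \qquad \text{(A.4)}$$

with (at least) $\alpha_0 \ne 0^{\mathcal{A}}$.

Conversely, any revision $V^*$ of type (A.4) is $V$-consistent and satisfies
Definition 4.5.1, 3(i). Let
$\hat{\omega} = \omega_1^{r_1} \cdot \ldots \cdot \omega_m^{r_m} \in \hat{\Omega}_+$;
then

$$\sigma_{(B|A)}(\hat{\omega}) = (a^+)^{\sum_{k:\omega_k \models AB} r_k} (a^-)^{\sum_{k:\omega_k \models A\overline{B}} r_k}$$

and

$$(V^*/V)(\hat{\omega}) = (\alpha^+)^{\sum_{k:\omega_k \models AB} r_k} \odot (\alpha^-)^{\sum_{k:\omega_k \models A\overline{B}} r_k} \odot \alpha_0^{\sum_{k:\omega_k \models \overline{A}} r_k}.$$

Thus we see that $V^*$ of type (A.4) is indifferent with respect to $V$ and
$(B|A)$, and is strictly indifferent with respect to $V$ and $(B|A)$ iff
$\alpha_0 = 1^{\mathcal{A}}$. Furthermore, by the remarks above, it is clear
that if $V^*$ is strictly $V$-consistent, then all constants
$\alpha_0, \alpha^+, \alpha^-$ can be chosen $\ne 0^{\mathcal{A}}$. This
completes the proof.

□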
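
For the probabilistic case, where $\odot$ is ordinary multiplication, a
revision of type (A.4) can be tried out numerically. The Python sketch below
is only an illustration under stated assumptions: the three atoms, the prior,
the chosen factors and all function names are hypothetical, and the factor
for the worlds outside the antecedent is picked so that the result is again a
probability distribution.

```python
import math
from itertools import product

ATOMS = ("a", "b", "c")
WORLDS = list(product((True, False), repeat=len(ATOMS)))


def revise(p, alpha_plus, alpha_minus, alpha_zero):
    """Apply one factor per case of (A.4) for the conditional (b|a)."""
    post = {}
    for world in WORLDS:
        a, b, _ = world
        if a and b:
            factor = alpha_plus   # world verifies (b|a)
        elif a:
            factor = alpha_minus  # world falsifies (b|a)
        else:
            factor = alpha_zero   # (b|a) is not applicable
        post[world] = factor * p[world]
    return post


def odds(p, c_d, c_not_d):
    """Ratio P(CD) / P(C not-D) for predicates describing CD and C not-D."""
    return (sum(p[w] for w in WORLDS if c_d(w))
            / sum(p[w] for w in WORLDS if c_not_d(w)))


def odds_preserved(prior, post, c_d, c_not_d):
    """Check whether the revision leaves the odds of (D|C) unchanged."""
    return math.isclose(odds(prior, c_d, c_not_d), odds(post, c_d, c_not_d))


# Hypothetical prior in the order abc, ab~c, a~bc, a~b~c, ~abc, ~ab~c, ~a~bc, ~a~b~c.
prior = dict(zip(WORLDS, [0.1, 0.1, 0.2, 0.1, 0.1, 0.2, 0.1, 0.1]))
post = revise(prior, alpha_plus=2.0, alpha_minus=0.5, alpha_zero=0.9)
```

Conditionals whose antecedent lies entirely within one of the three cases,
such as (c|ab) or (b|not a) in this example, keep their odds, whereas the
odds of (b|a) itself change; this is the behaviour that the quantitative
preservation postulate discussed in the following proofs describes.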

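Proof of Lemma 4.5.1: Suppose the revision $V^* = V * (B|A)$ is strictly
$V$-consistent and satisfies (CR5$^{quant}$). Let $(D|C)$ be a conditional
such that $(D|C) \perp\!\!\!\perp (B|A)$ and with $V(C) \ne 0^{\mathcal{A}}$.
Since $V^* = V * (B|A)$ is strictly $V$-consistent, we also have
$V^*(C) \ne 0^{\mathcal{A}}$, and $V(CD) = 0^{\mathcal{A}}$ iff
$V^*(CD) = 0^{\mathcal{A}}$. If $V(CD) = V^*(CD) = 0^{\mathcal{A}}$, then
$V(D|C) = V^*(D|C) = 0^{\mathcal{A}}$; if
$V(C\overline{D}) = V^*(C\overline{D}) = 0^{\mathcal{A}}$, then
$V(D|C) = V^*(D|C) = 1^{\mathcal{A}}$.

So assume now $V(CD), V(C\overline{D}) \ne 0^{\mathcal{A}}$. Then, by
(CR5$^{quant}$),

$$V(C\overline{D}) \odot V(CD)^{-1} = V^*(C\overline{D}) \odot V^*(CD)^{-1},$$

and consequently,

$$\begin{aligned}
V(D|C) &= V(CD) \odot V(C)^{-1} \\
&= V(CD) \odot (V(CD) \oplus V(C\overline{D}))^{-1} \\
&= V(CD) \odot V(CD)^{-1} \odot (1^{\mathcal{A}} \oplus V(C\overline{D}) \odot V(CD)^{-1})^{-1} \\
&= (1^{\mathcal{A}} \oplus V^*(C\overline{D}) \odot V^*(CD)^{-1})^{-1} \\
&= V^*(CD) \odot V^*(CD)^{-1} \odot (1^{\mathcal{A}} \oplus V^*(C\overline{D}) \odot V^*(CD)^{-1})^{-1} \\
&= V^*(CD) \odot (V^*(CD) \oplus V^*(C\overline{D}))^{-1} \\
&= V^*(CD) \odot V^*(C)^{-1} \\
&= V^*(D|C).
\end{aligned}$$

□

Proof of Proposition 4.5.2: Let $V^* = V * (B|A)$ denote a strictly
$V$-consistent revision of $V$ by $(B|A)$ satisfying
$V^*(A) \ne 0^{\mathcal{A}}$ and (CR5$^{quant}$). Suppose
$(D|C) \perp\!\!\!\perp (B|A)$. If $V(CD) = V^*(CD) = 0^{\mathcal{A}}$, then
neither $V$ nor $V^*$ accepts $(D|C)$. So let
$V(CD), V^*(CD) \ne 0^{\mathcal{A}}$. Then (CR5$^{quant}$) implies

$$V(C\overline{D}) \odot V(CD)^{-1} = V^*(C\overline{D}) \odot V^*(CD)^{-1}. \qquad \text{(A.5)}$$

Following Section 4.3, we have

$$\begin{aligned}
V \models (D|C) &\iff V(C\overline{D}) <_{\mathcal{A}} V(CD) \\
&\iff V(C\overline{D}) \odot V(CD)^{-1} <_{\mathcal{A}} 1^{\mathcal{A}} \\
&\iff V^*(C\overline{D}) \odot V^*(CD)^{-1} <_{\mathcal{A}} 1^{\mathcal{A}} \quad \text{(due to (A.5))} \\
&\iff V^*(C\overline{D}) <_{\mathcal{A}} V^*(CD) \\
&\iff V^* \models (D|C).
\end{aligned}$$

Thus (CR5) holds.

□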



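Proof of Theorem 4.5.1: Let $V$ be a conditional valuation function, and
let $V^* = V * \{(B|A)\}$ denote a strictly $V$-consistent revision of $V$ by
$(B|A)$ fulfilling the postulates (CR1) (success) and (CR2) (stability). So in
particular, we have $V^*(A) \ne 0^{\mathcal{A}}$, and by the strict
$V$-consistency of the revision, we also have $V(A) \ne 0^{\mathcal{A}}$.

If $V^*$ satisfies the principle of conditional preservation, then, by
Proposition 4.5.1, there exist constants
$\alpha_0, \alpha^+, \alpha^- \ne 0^{\mathcal{A}}$ in $\mathcal{A}$ such that

$$V^*(\omega) = \begin{cases} \alpha^+ \odot V(\omega) & \text{if } \omega \models AB \\ \alpha^- \odot V(\omega) & \text{if } \omega \models A\overline{B} \\ \alpha_0 \odot V(\omega) & \text{if } \omega \models \overline{A} \end{cases}$$

To prove (CR5$^{quant}$), suppose that $(D|C) \perp\!\!\!\perp (B|A)$ and
$V^*(CD) \ne 0^{\mathcal{A}}$. So $Mod(C)$ is completely included in one of
$Mod(AB)$, $Mod(A\overline{B})$, $Mod(\overline{A})$. Then for a suitable
$\alpha \in \{\alpha_0, \alpha^+, \alpha^-\}$, we obtain

$$\begin{aligned}
V^*(C\overline{D}) \odot V^*(CD)^{-1} &= \Big(\bigoplus_{\omega \models C\overline{D}} \alpha \odot V(\omega)\Big) \odot \Big(\bigoplus_{\omega \models CD} \alpha \odot V(\omega)\Big)^{-1} \\
&= \Big(\alpha \odot \bigoplus_{\omega \models C\overline{D}} V(\omega)\Big) \odot \Big(\alpha \odot \bigoplus_{\omega \models CD} V(\omega)\Big)^{-1} \\
&= \alpha \odot V(C\overline{D}) \odot \alpha^{-1} \odot V(CD)^{-1} \\
&= V(C\overline{D}) \odot V(CD)^{-1}.
\end{aligned}$$

This shows (CR5$^{quant}$).

Suppose now $(D|C) \sqsubseteq (B|A)$, i.e. $CD \le AB$ and
$C\overline{D} \le A\overline{B}$. Then, as in the calculations above, we
obtain $V^*(AB) = \alpha^+ \odot V(AB)$,
$V^*(A\overline{B}) = \alpha^- \odot V(A\overline{B})$ and
$V^*(CD) = \alpha^+ \odot V(CD)$,
$V^*(C\overline{D}) = \alpha^- \odot V(C\overline{D})$. Furthermore,
$V(CD) \le_{\mathcal{A}} V(AB)$ and
$V(C\overline{D}) \le_{\mathcal{A}} V(A\overline{B})$.

By prerequisite, $V^* \models (B|A)$, thus
$V^*(A\overline{B}) <_{\mathcal{A}} V^*(AB)$. If $V \models (B|A)$,
then, by (CR2), $V = V^*$, and (CR6), (CR7) are trivially fulfilled.

So assume now that $V \not\models (B|A)$, that is,
$V(AB) \le_{\mathcal{A}} V(A\overline{B})$. From $V^* \models (B|A)$, we have
$\alpha^- \odot V(A\overline{B}) <_{\mathcal{A}} \alpha^+ \odot V(AB)$, which
implies $\alpha^- <_{\mathcal{A}} \alpha^+$. If $V \models (D|C)$, this yields

$$V^*(C\overline{D}) = \alpha^- \odot V(C\overline{D}) <_{\mathcal{A}} \alpha^+ \odot V(C\overline{D}) <_{\mathcal{A}} \alpha^+ \odot V(CD) = V^*(CD),$$

hence $V^* \models (D|C)$. This shows (CR6).

To prove (CR7), suppose $(D|C) \sqsubseteq (\overline{B}|A)$,
$V \not\models (B|A)$ and $V^* \models (D|C)$, i.e.
$V^*(C\overline{D}) <_{\mathcal{A}} V^*(CD)$. Then
$\alpha^+ \odot V(C\overline{D}) <_{\mathcal{A}} \alpha^- \odot V(CD)$, and
consequently, by using $\alpha^- <_{\mathcal{A}} \alpha^+$,
$V(C\overline{D}) <_{\mathcal{A}} V(CD)$, which means $V \models (D|C)$.
This shows (CR7). □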



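Proof of Proposition 4.6.1: Let $V^*$ be a conditional valuation function,
let $V_0$ be the uniform conditional valuation function. By definition,
$V_0(\omega) = \alpha \ne 0^{\mathcal{A}}$ for some $\alpha \in \mathcal{A}$
and for all $\omega \in \Omega$. So $V^*$ is trivially $V_0$-consistent,
and $(V^*/V_0)(\omega) = \alpha^{-1} \odot V^*(\omega)$, $\omega \in \Omega$.

According to Definition 4.5.1, $V^*$ is indifferent with respect to
$\mathcal{R}$ and $V_0$ iff $V^*(\omega) = 0^{\mathcal{A}}$ implies that there
is $(B|A) \in \mathcal{R}$ such that $\sigma_{(B|A)}(\omega) \ne 1$ and
$V^*(\omega') = 0^{\mathcal{A}}$ for all $\omega'$ with
$\sigma_{(B|A)}(\omega') = \sigma_{(B|A)}(\omega)$, and, for
$\hat{\omega}_1, \hat{\omega}_2 \in \hat{\Omega}_+$,

$$(V^*/V_0)(\hat{\omega}_1) = (V^*/V_0)(\hat{\omega}_2) \text{ if } \sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2) \text{ and } \hat{\omega}_1 \hat{\Omega}_0 = \hat{\omega}_2 \hat{\Omega}_0 \qquad \text{(A.6)}$$

For $\hat{\omega} = \omega_1^{r_1} \cdot \ldots \cdot \omega_m^{r_m}$, we have
$(V^*/V_0)(\hat{\omega}) = \alpha^{-\sum_k r_k} \odot V^*(\hat{\omega})$, so
(A.6) is equivalent to stating

$$V^*(\hat{\omega}_1) = V^*(\hat{\omega}_2) \text{ if } \sigma_{\mathcal{R}}(\hat{\omega}_1) = \sigma_{\mathcal{R}}(\hat{\omega}_2) \text{ and } \hat{\omega}_1 \hat{\Omega}_0 = \hat{\omega}_2 \hat{\Omega}_0.$$

Comparing this to Definition 3.6.1, we see that both conditions together mean
the conditional indifference of $V^*$ with respect to $\mathcal{R}$. □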



Proofs of Chapter 5 



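Proof of Proposition 5.2.1: Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and suppose $P_1, P_2$ are two distributions with $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$ and for all $i = 1, \ldots, n$, $P_1(\omega) = 0$ iff $P_2(\omega) = 0$, and such that $\mathcal{R}$ is $P_1$- and $P_2$-consistent. According to (5.8), $(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in WF(P_1, \mathcal{R})$ iff $\alpha_i^+, \alpha_i^- \geqslant 0$, $\alpha_i^+ = 0$ iff $x_i = 0$, $\alpha_i^- = 0$ iff $x_i = 1$ and

$$
(1-x_i)\,\alpha_i^+ \sum_{\omega \models A_iB_i} P_1(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-
\;=\;
x_i\,\alpha_i^- \sum_{\omega \models A_i\overline{B_i}} P_1(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-
$$

for all $i = 1, \ldots, n$.

For any such $i$, and because of $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$, we have for all $\omega$ with $\omega \models A_i$: $\frac{P_1(\omega)}{P_1(A_i)} = \frac{P_2(\omega)}{P_2(A_i)}$, so $P_1(\omega) = \frac{P_1(A_i)}{P_2(A_i)} P_2(\omega)$.

Consequently, the equations above may be rewritten as

$$
(1-x_i)\,\alpha_i^+ \,\frac{P_1(A_i)}{P_2(A_i)} \sum_{\omega \models A_iB_i} P_2(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-
\;=\;
x_i\,\alpha_i^- \,\frac{P_1(A_i)}{P_2(A_i)} \sum_{\omega \models A_i\overline{B_i}} P_2(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-,
$$

which holds iff

$$
(1-x_i)\,\alpha_i^+ \sum_{\omega \models A_iB_i} P_2(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-
\;=\;
x_i\,\alpha_i^- \sum_{\omega \models A_i\overline{B_i}} P_2(\omega) \prod_{\substack{j \neq i \\ \omega \models A_jB_j}} \alpha_j^+ \prod_{\substack{j \neq i \\ \omega \models A_j\overline{B_j}}} \alpha_j^-,
$$

because $\frac{P_1(A_i)}{P_2(A_i)} > 0$. Together with the positivity condition, this is equivalent to $(\alpha_1^+, \alpha_1^-, \ldots, \alpha_n^+, \alpha_n^-) \in WF(P_2, \mathcal{R})$. (Note that all elementary events $\omega$ occurring in the sums above satisfy $\omega \models A_i$.) □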



Proof of Proposition 5.2.2: Assume that all notations are as stated in 
the text of the proposition. 

{af ,a^ , . . . ,a^,a~) G wf{P*), so in particular, (aYaJ)j^j satisfy the 
positivity condition for Pj, and for each j G J, we have 



{i-xj)a+ Y n n 

A=AjBj 



NAfcBfc a,|=Aj.Bfc 



— XjUj 



e_bm n 



OLi 



n 



a,|=A;,Bfc 
(1


- Xj)a+ P{uj) 


n 


n n 


n “fc 






iel 


k^J,k^j i€.I 

‘^\=^k^k 


k£j,k^j 




= Xjaj Y n 


n n 


n 




iu\^Aj Bj 


iGl 


k^J,k^j i^I 


k^J,k^j 


(1 


- Xj)a+ Y 


n «t 


n 








kGJ,k^j 


kGJ,k^j 





= XjUj n “fe n “fc- 

kGJ,ky£j k^J,k^j 

Therefore (a^, a~)j^j £ WF(P/, TZj), and, by a straightforward calculation, 



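Proof of Proposition 5.2.3: Proof of (i). Let $P_1, P_2$ be positive distributions over two variables $A, B$ with $P_1(b|a) = P_2(b|a)$, and let $\mathcal{R} = \{(b|a)[x]\}$, $x \in (0,1)$. Let $P_k^* := F_c(P_k, \mathcal{R}) \in C(P_k, \mathcal{R})$ be c-revisions with weight factors $\alpha^+, \alpha^-$ resp. $\beta^+, \beta^-$, $k = 1, 2$. According to (4.23) and (4.23), $P_1^*, P_2^*$ have the following forms, respectively:

$$
\begin{array}{c|c|c}
\omega & P_1^* & P_2^* \\ \hline
ab & \alpha_0\alpha^+ P_1(ab) & \beta_0\beta^+ P_2(ab) \\
a\overline{b} & \alpha_0\alpha^- P_1(a\overline{b}) & \beta_0\beta^- P_2(a\overline{b}) \\
\overline{a}b & \alpha_0 P_1(\overline{a}b) & \beta_0 P_2(\overline{a}b) \\
\overline{a}\overline{b} & \alpha_0 P_1(\overline{a}\overline{b}) & \beta_0 P_2(\overline{a}\overline{b})
\end{array}
$$

with

$$\frac{\alpha^+}{\alpha^-} = \frac{x}{1-x}\,\frac{P_1(a\overline{b})}{P_1(ab)} = \frac{x}{1-x}\,\frac{P_2(a\overline{b})}{P_2(ab)} = \frac{\beta^+}{\beta^-}$$

(note that $x \neq 0, 1$). Due to the positivity of the prior distributions, the weight factors $\alpha^+, \alpha^-$ and $\beta^+, \beta^-$ are uniquely determined by $F_c$, i.e. $\mathrm{card}(wf(F_c(P_k, \mathcal{R}))) = 1$, $k = 1, 2$.

Calculating all cross-ratios

$$\frac{P_1^*(\omega)}{P_1(\omega)} : \frac{P_2^*(\omega)}{P_2(\omega)},$$

we obtain the three values $\frac{\alpha_0\alpha^+}{\beta_0\beta^+}$, $\frac{\alpha_0\alpha^-}{\beta_0\beta^-}$ and $\frac{\alpha_0}{\beta_0}$. So if $F_c$ satisfies the relevance condition, then

$$\frac{\alpha_0\alpha^+}{\beta_0\beta^+} = \frac{\alpha_0\alpha^-}{\beta_0\beta^-} = \frac{\alpha_0}{\beta_0}.$$

This implies $\alpha^+ = \beta^+$ and $\alpha^- = \beta^-$.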

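Proof of (ii). Suppose $P, P'$ are positive prior distributions. Let $\mathcal{R} = \{(B|A)[x], (B_1|A_1)[x_1], \ldots\}$, $\mathcal{R}' = \{(B'|A')[x], (B_1'|A_1')[x_1'], \ldots\}$ be two (finite) $P$- and $P'$-continuous sets of probabilistic conditionals, respectively, all of $x, x_i, x_i' \in (0,1)$, no variable occurring both in antecedent and conclusion of any conditional in $\mathcal{R}$ and $\mathcal{R}'$. Let $\alpha^+, \alpha^-$ resp. $\alpha'^+, \alpha'^-$ be weight factors associated in $F_c(P, \mathcal{R})$ resp. $F_c(P', \mathcal{R}')$ with the conditional $(B|A)[x]$ resp. $(B'|A')[x]$. Note that the same probability $x$ is assigned to both $(B|A)$ in $\mathcal{R}$ and $(B'|A')$ in $\mathcal{R}'$. Let $F_c$ satisfy the conditions of relevance, continuity and atomicity, and assume $\frac{\alpha^+}{\alpha^-} = \frac{\alpha'^+}{\alpha'^-}$. We have to show: $\alpha^+ = \alpha'^+$ and $\alpha^- = \alpha'^-$.

For $F_c(P, \mathcal{R})$, we have

$$
\frac{\alpha^+}{\alpha^-} = \frac{x}{1-x}\,
\frac{\displaystyle\sum_{\omega \models A\overline{B}} P(\omega) \prod_{\omega \models A_iB_i} \alpha_i^+ \prod_{\omega \models A_i\overline{B_i}} \alpha_i^-}
{\displaystyle\sum_{\omega \models AB} P(\omega) \prod_{\omega \models A_iB_i} \alpha_i^+ \prod_{\omega \models A_i\overline{B_i}} \alpha_i^-},
$$

where the $\alpha_i^+, \alpha_i^-$ are associated with the remaining conditionals $(B_i|A_i)[x_i]$ in $\mathcal{R}$, $i \in I$. Set $P_1 = P[\alpha_i^+, \alpha_i^-]_{i \in I} = P_J$, in the notation of Definition 5.2.2. $F_c$ is supposed to satisfy the continuity condition, therefore

$$(\alpha^+, \alpha^-) \in wf(F_c(P_1, \{(B|A)[x]\})).$$

Similarly,

$$(\alpha'^+, \alpha'^-) \in wf(F_c(P_1', \{(B'|A')[x]\}))$$

with $P_1' = P'_{J'}$. In particular, we have

$$\frac{\alpha^+}{\alpha^-} = \frac{x}{1-x}\,\frac{P_1(A\overline{B})}{P_1(AB)} \quad\text{and}\quad \frac{\alpha'^+}{\alpha'^-} = \frac{x}{1-x}\,\frac{P_1'(A'\overline{B'})}{P_1'(A'B')}.$$

Thus $\frac{\alpha^+}{\alpha^-} = \frac{\alpha'^+}{\alpha'^-}$ implies $\frac{P_1(A\overline{B})}{P_1(AB)} = \frac{P_1'(A'\overline{B'})}{P_1'(A'B')}$, hence $P_1(B|A) = P_1'(B'|A')$.

By virtue of the atomicity condition, we may replace $A, A'$ and $B, B'$ by new propositional variables $A$ and $B$ (note that we assumed that no variable occurs both in antecedent and conclusion). Suitably marginalizing $P_1$ and $P_1'$, we thus obtain positive distributions over $A, B$ (which we will denote again by $P_1$ and $P_1'$) with $P_1(b|a) = P_1'(b|a)$, and $\alpha^+, \alpha^-$ resp. $\alpha'^+, \alpha'^-$ being the weight factors of $F_c(P_1, \{(b|a)[x]\})$ resp. $F_c(P_1', \{(b|a)[x]\})$. From (i), it follows that $\alpha^+ = \alpha'^+$ and $\alpha^- = \alpha'^-$, as desired. □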



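Proof of Proposition 5.2.4: Let $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and suppose $P_1, P_2$ are two distributions with $P_1(\omega|A_i) = P_2(\omega|A_i)$ for all $\omega \in \Omega$ and for all $i = 1, \ldots, n$, $P_1(\omega) = 0$ iff $P_2(\omega) = 0$, and such that $\mathcal{R}$ is $P_1$- and $P_2$-consistent. Let $P_k^* = F_*(P_k, \mathcal{R})$ for $k = 1, 2$,

$$
\begin{aligned}
P_1^*(\omega) &= P_1[\alpha_1, \ldots, \alpha_n]_F(\omega) = \alpha_0 P_1(\omega) \prod_{\omega \models A_iB_i} F^+(x_i, \alpha_i) \prod_{\omega \models A_i\overline{B_i}} F^-(x_i, \alpha_i), \\
P_2^*(\omega) &= P_2[\beta_1, \ldots, \beta_n]_F(\omega) = \beta_0 P_2(\omega) \prod_{\omega \models A_iB_i} F^+(x_i, \beta_i) \prod_{\omega \models A_i\overline{B_i}} F^-(x_i, \beta_i),
\end{aligned}
$$

$(\alpha_1, \ldots, \alpha_n) \in WQ_F(P_1, \mathcal{R})$, $(\beta_1, \ldots, \beta_n) \in WQ_F(P_2, \mathcal{R})$. Arguing as in the proof of Proposition 5.2.1, we see $WQ_F(P_1, \mathcal{R}) = WQ_F(P_2, \mathcal{R})$. In particular, $(\alpha_1, \ldots, \alpha_n) \in WQ_F(P_2, \mathcal{R})$.

Due to uniqueness, we thus have $P_2^* = P_2[\alpha_1, \ldots, \alpha_n]_F$, i.e.

$$P_2^*(\omega) = \alpha_0' P_2(\omega) \prod_{\omega \models A_iB_i} F^+(x_i, \alpha_i) \prod_{\omega \models A_i\overline{B_i}} F^-(x_i, \alpha_i).$$

Now for any $\omega \in \Omega$, we have $P_1(\omega) = 0$ iff $P_2(\omega) = 0$, or else

$$\frac{P_1^*(\omega)}{P_1(\omega)} : \frac{P_2^*(\omega)}{P_2(\omega)} = \frac{\alpha_0}{\alpha_0'},$$

a constant. This proves the relevance property.

With the notations of Definition 5.2.2, $(\alpha_j)_{j \in J} \in WQ_F(P_J, \mathcal{R}_J)$, therefore by the condition of uniqueness, $F_*(P_J, \mathcal{R}_J) = P_J[\alpha_j, j \in J]_F = P[\alpha_1, \ldots, \alpha_n]_F = F_*(P, \mathcal{R})$. So $F_*$ satisfies the continuity condition.

At last, Theorem 5.1.1 implies immediately $WQ_F(P, \mathcal{R}) = WQ_F(P', \mathcal{R}')$. Atomicity now follows from uniqueness, using equivalences of classical-logical formulas. We omit the technical details. □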



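Proof of Theorem 5.3.1: If (5.18) holds in principle for any adaptation carried out by $*_F$, it is surely valid for some special type of $P$, $\mathcal{R}_1$ and $\mathcal{R}_2$. So let $P$ be any positive distribution over 3 variables $a$, $b$ and $c$, let $\mathcal{R}_1 = \{(c|a)[x]\}$ and $\mathcal{R}_2 = \{(c|b)[y]\}$. Let $p_1, \ldots, p_8$ denote the prior probabilities of $P$, $p_1 = P(abc), \ldots, p_8 = P(\overline{a}\overline{b}\overline{c})$. The following tables show the three adapted distributions $P *_F \mathcal{R}_1$, $(P *_F \mathcal{R}_1) *_F (\mathcal{R}_1 \cup \mathcal{R}_2)$ and $P *_F (\mathcal{R}_1 \cup \mathcal{R}_2)$:

$$
\begin{array}{c|l|l}
\omega & P *_F (\mathcal{R}_1 \cup \mathcal{R}_2) & P *_F \mathcal{R}_1 \\ \hline
abc & \alpha_0 p_1 F^+(x,\alpha) F^+(y,\beta) & \beta_0 p_1 F^+(x,\alpha') \\
ab\overline{c} & \alpha_0 p_2 F^-(x,\alpha) F^-(y,\beta) & \beta_0 p_2 F^-(x,\alpha') \\
a\overline{b}c & \alpha_0 p_3 F^+(x,\alpha) & \beta_0 p_3 F^+(x,\alpha') \\
a\overline{b}\overline{c} & \alpha_0 p_4 F^-(x,\alpha) & \beta_0 p_4 F^-(x,\alpha') \\
\overline{a}bc & \alpha_0 p_5 F^+(y,\beta) & \beta_0 p_5 \\
\overline{a}b\overline{c} & \alpha_0 p_6 F^-(y,\beta) & \beta_0 p_6 \\
\overline{a}\overline{b}c & \alpha_0 p_7 & \beta_0 p_7 \\
\overline{a}\overline{b}\overline{c} & \alpha_0 p_8 & \beta_0 p_8
\end{array}
$$

$$
\begin{array}{c|l}
\omega & (P *_F \mathcal{R}_1) *_F (\mathcal{R}_1 \cup \mathcal{R}_2) \\ \hline
abc & \alpha_0' \beta_0 p_1 F^+(x,\alpha') F^+(x,\alpha_1) F^+(y,\beta_1) \\
ab\overline{c} & \alpha_0' \beta_0 p_2 F^-(x,\alpha') F^-(x,\alpha_1) F^-(y,\beta_1) \\
a\overline{b}c & \alpha_0' \beta_0 p_3 F^+(x,\alpha') F^+(x,\alpha_1) \\
a\overline{b}\overline{c} & \alpha_0' \beta_0 p_4 F^-(x,\alpha') F^-(x,\alpha_1) \\
\overline{a}bc & \alpha_0' \beta_0 p_5 F^+(y,\beta_1) \\
\overline{a}b\overline{c} & \alpha_0' \beta_0 p_6 F^-(y,\beta_1) \\
\overline{a}\overline{b}c & \alpha_0' \beta_0 p_7 \\
\overline{a}\overline{b}\overline{c} & \alpha_0' \beta_0 p_8
\end{array}
$$

Postulating $P *_F (\mathcal{R}_1 \cup \mathcal{R}_2) = (P *_F \mathcal{R}_1) *_F (\mathcal{R}_1 \cup \mathcal{R}_2)$ yields $\alpha_0 = \alpha_0'\beta_0$ and $F^+(y,\beta) = F^+(y,\beta_1)$, $F^-(y,\beta) = F^-(y,\beta_1)$, hence $\beta = \beta_1$ because of (5.12).

Further, for $x = 0$, we see $\alpha = \alpha' = \alpha_1 = 0$ and $F^-(0,0) = F^-(0,0) \cdot F^-(0,0)$. Due to (5.14), $F^-(0,0) \neq 0$, hence $F^-(0,0) = 1$. Similarly, $F^+(1,\infty) = 1$.

For $x \neq 1$, the weight quotients $\alpha$, $\alpha'$ and $\alpha_1$ may be calculated as

$$
\alpha = \frac{x}{1-x} \cdot \frac{p_2 F^-(y,\beta) + p_4}{p_1 F^+(y,\beta) + p_3}, \qquad
\alpha' = \frac{x}{1-x} \cdot \frac{p_2 + p_4}{p_1 + p_3},
$$

$$
\alpha_1 = \frac{x}{1-x} \cdot \frac{F^-(x,\alpha')}{F^+(x,\alpha')} \cdot \frac{p_2 F^-(y,\beta_1) + p_4}{p_1 F^+(y,\beta_1) + p_3}
= \frac{x}{1-x} \cdot \alpha'^{-1} \cdot \frac{p_2 F^-(y,\beta_1) + p_4}{p_1 F^+(y,\beta_1) + p_3},
$$

thus $\alpha = \alpha'\alpha_1$.

Comparing again $P *_F (\mathcal{R}_1 \cup \mathcal{R}_2)$ and $(P *_F \mathcal{R}_1) *_F (\mathcal{R}_1 \cup \mathcal{R}_2)$ we obtain

$$F^-(x,\alpha) = F^-(x,\alpha'\alpha_1) = F^-(x,\alpha')\,F^-(x,\alpha_1). \qquad \text{(A.7)}$$

For fixed $x$, $P$ and $y$ can still be chosen arbitrarily. Choosing $y = 0$ resp. $y = 1$ simplifies the equations for $\alpha, \alpha', \alpha_1$ considerably, so that these weight quotients depend only on $P$ (and, of course, on $x$). Straightforward calculations show that indeed any $\alpha', \alpha_1 \in \mathbb{R}^+$ may be represented as weight quotients by setting up $P$ appropriately. Therefore (A.7), i.e. (5.20), must hold for all positive real $\alpha', \alpha_1$, and all $x \in (0,1)$. □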



Proof of Proposition 5.3.1: Let the preconditions of Proposition 5.3.1 be satisfied. Assume $x \in (0,1)$ to be held fixed and let for a moment $F_x^-(\alpha) := F^-(x,\alpha)$ be regarded only as a function of $\alpha$ (even if $x$ is held fixed, $\alpha$ still may vary because it generally depends on many parameters other than $x$, at least on the prior distribution; cf. proof of Theorem 5.3.1).

According to [Acz61, pp. 46ff] and by taking account of (5.20), we see $F_x^-(\alpha) = \alpha^c$ for some real constant $c$. Again taking into consideration the dependency on $x$, we obtain $F^-(x,\alpha) = \alpha^{c(x)}$. Due to (5.12), $F^+(x,\alpha) = \alpha \cdot \alpha^{c(x)} = \alpha^{c(x)+1}$ for any positive real $\alpha$ and any $x \in (0,1)$. This proves (5.22). (5.23) now is obvious.

According to (5.19), $1 = F^-(0,0) = \lim_{x \to 0} F^-(x,\alpha) = \lim_{x \to 0} \alpha^{c(x)}$; this implies $\lim_{x \to 0} c(x) = 0$.

Similarly, by observing (5.22), (5.19) and (5.13), we obtain $\lim_{x \to 1} c(x) = -1$. □



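Proof of Proposition 5.4.1: All priors $P$ in this proof are assumed to be positive.

Suppose first $\mathcal{R} = \{(B|A)[x]\}$ and $\mathcal{R}' = \{(\overline{B}|A)[1-x]\}$, $x \in (0,1)$, and let $\alpha$ resp. $\beta$ be the factor associated with $\mathcal{R}$ resp. $\mathcal{R}'$. Let $P^*_{\mathcal{R}} = P *_F \mathcal{R}$, and $P^*_{\mathcal{R}'} = P *_F \mathcal{R}'$. According to the functional concept (P2),

$$
P^*_{\mathcal{R}}(\omega) = \alpha_0 P(\omega) \cdot
\begin{cases}
F^+(x,\alpha) & : \omega \models AB \\
F^-(x,\alpha) & : \omega \models A\overline{B} \\
1 & : \omega \models \overline{A}
\end{cases}
$$

and

$$
P^*_{\mathcal{R}'}(\omega) = \beta_0 P(\omega) \cdot
\begin{cases}
F^+(1-x,\beta) & : \omega \models A\overline{B} \\
F^-(1-x,\beta) & : \omega \models AB \\
1 & : \omega \models \overline{A}
\end{cases}
$$

with $\alpha = \frac{x}{1-x} \cdot \frac{P(A\overline{B})}{P(AB)} = \beta^{-1}$. If (P4) is satisfied then $P^*_{\mathcal{R}} = P^*_{\mathcal{R}'}$, thus implying $F^+(x,\alpha) = F^-(1-x,\alpha^{-1})$ and $F^-(x,\alpha) = F^+(1-x,\alpha^{-1})$. Together with (5.12), this shows $F^-(x,\alpha) = \alpha^{-1} F^-(1-x,\alpha^{-1})$. Using (5.22), this proves (5.25).

Now we are going to prove (5.26). Because of $P(b|a) = P(b|ac)P(c|a) + P(b|a\overline{c})P(\overline{c}|a)$ for arbitrarily chosen variables $a$, $b$ and $c$, the two sets of rules

$$\mathcal{R} = \{(c|a)[x],\ (b|ac)[x_1],\ (b|a\overline{c})[x_2]\}$$

and

$$\mathcal{R}' = \{(b|a)[y],\ (b|ac)[x_1],\ (b|a\overline{c})[x_2]\} \quad\text{with } y = xx_1 + (1-x)x_2$$

are probabilistically equivalent for $x, x_1, x_2 \in (0,1)$. Because $*_F$ is assumed to satisfy (5.24), we have $P *_F \mathcal{R} = P *_F \mathcal{R}'$. By applying (5.15), we list both distributions below. The correspondence between each $\alpha_i$ resp. $\beta_i$ and the conditional it belongs to should be clear.

$$
\begin{array}{c|l|l}
\omega & P *_F \mathcal{R} & P *_F \mathcal{R}' \\ \hline
abc & \alpha_0 p_1 \alpha_1^{c(x)+1} \alpha_2^{c(x_1)+1} & \beta_0 p_1 \beta_1^{c(y)+1} \beta_2^{c(x_1)+1} \\
ab\overline{c} & \alpha_0 p_2 \alpha_1^{c(x)} \alpha_3^{c(x_2)+1} & \beta_0 p_2 \beta_1^{c(y)+1} \beta_3^{c(x_2)+1} \\
a\overline{b}c & \alpha_0 p_3 \alpha_1^{c(x)+1} \alpha_2^{c(x_1)} & \beta_0 p_3 \beta_1^{c(y)} \beta_2^{c(x_1)} \\
a\overline{b}\overline{c} & \alpha_0 p_4 \alpha_1^{c(x)} \alpha_3^{c(x_2)} & \beta_0 p_4 \beta_1^{c(y)} \beta_3^{c(x_2)} \\
\overline{a}bc & \alpha_0 p_5 & \beta_0 p_5 \\
\overline{a}b\overline{c} & \alpha_0 p_6 & \beta_0 p_6 \\
\overline{a}\overline{b}c & \alpha_0 p_7 & \beta_0 p_7 \\
\overline{a}\overline{b}\overline{c} & \alpha_0 p_8 & \beta_0 p_8
\end{array}
$$

with

$$\alpha_1 = \frac{x}{1-x} \cdot \frac{\alpha_3^{c(x_2)}\,(p_2\alpha_3 + p_4)}{\alpha_2^{c(x_1)}\,(p_1\alpha_2 + p_3)}, \qquad \text{(A.8)}$$

$$\alpha_2 = \frac{x_1}{1-x_1} \cdot \frac{p_3}{p_1}, \qquad \alpha_3 = \frac{x_2}{1-x_2} \cdot \frac{p_4}{p_2}, \qquad \text{(A.9)}$$

$$\beta_1 = \frac{y}{1-y} \cdot \frac{p_3\beta_2^{c(x_1)} + p_4\beta_3^{c(x_2)}}{p_1\beta_2^{c(x_1)+1} + p_2\beta_3^{c(x_2)+1}}, \qquad \text{(A.10)}$$

$$\beta_2 = \frac{x_1}{1-x_1} \cdot \frac{p_3}{p_1\beta_1} = \alpha_2\beta_1^{-1}, \qquad \beta_3 = \frac{x_2}{1-x_2} \cdot \frac{p_4}{p_2\beta_1} = \alpha_3\beta_1^{-1}. \qquad \text{(A.11)}$$

These last equations (A.10) and (A.11) yield

$$\frac{p_3\beta_2^{c(x_1)}}{p_4\beta_3^{c(x_2)}} = \frac{1-x_1}{1-x_2} \cdot \frac{x}{1-x},$$

and putting all these equations together we obtain

$$\alpha_1 = \beta_1^{c(x_2)-c(x_1)}. \qquad \text{(A.12)}$$

For $P *_F \mathcal{R} = P *_F \mathcal{R}'$ to hold, we must have necessarily

$$\alpha_1^{c(x)} = \beta_1^{c(y)-c(x_2)}, \qquad \text{(A.13)}$$

and (5.26) now follows from (5.25), (A.12) and (A.13). □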



Proof of Theorem 5.4.1: Properties (P3) and (P4) imply $F^-(x,\alpha) = \alpha^{c(x)}$ with a continuous real function $c(x)$ satisfying $\lim_{x \to 0} c(x) = 0$, $\lim_{x \to 1} c(x) = -1$ and

$$c(x) + c(1-x) = -1,$$

$$c(xx_1 + (1-x)x_2) = -c(x)c(x_1) - c(1-x)c(x_2)$$

for all real $x, x_1, x_2 \in (0,1)$, due to Propositions 5.3.1 and 5.4.1. Choosing $x = \frac{1}{2}$ in the first of these equations, we see $c(\frac{1}{2}) = -\frac{1}{2}$. From the second equation, $x_2 \to 0$ yields

$$c(xx_1) = -c(x)c(x_1). \qquad \text{(A.14)}$$

Using this, we obtain for $x = \frac{1}{2}$ the equation

$$c(\tfrac{1}{2}x_1 + \tfrac{1}{2}x_2) = c(\tfrac{1}{2}x_1) + c(\tfrac{1}{2}x_2)$$

for $x_1, x_2 \in (0,1)$. Therefore $c(x)$ fulfills a Cauchy functional equation on $(0, \frac{1}{2})$. Similar to the proof in [Acz61, p. 44], using (A.14) and $c(\frac{1}{2}) = -\frac{1}{2}$, one obtains $c(x) = -x$ for all $x \in (0,1)$. Together with (5.12), this shows (5.27). □
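
As a quick numerical sanity check of the two functional equations used in the proof above, the following Python sketch samples arguments from (0,1) and tests whether a candidate function c satisfies them. The helper name `satisfies_equations`, the sampling range, the tolerance, and the counterexample `c_square` are illustrative assumptions and not part of the original text; passing the check is of course no substitute for the proof.

```python
import random


def satisfies_equations(c, trials=1000, tol=1e-12, seed=0):
    """Check c(x) + c(1-x) = -1 and
    c(x*x1 + (1-x)*x2) = -c(x)*c(x1) - c(1-x)*c(x2)
    on randomly sampled x, x1, x2 in (0, 1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, x1, x2 = (rng.uniform(0.01, 0.99) for _ in range(3))
        # First equation, rearranged to "... = 0".
        if abs(c(x) + c(1 - x) + 1) > tol:
            return False
        # Second equation, rearranged to "... = 0".
        y = x * x1 + (1 - x) * x2
        if abs(c(y) + c(x) * c(x1) + c(1 - x) * c(x2)) > tol:
            return False
    return True


def c_me(x):
    """The solution singled out by the theorem: c(x) = -x."""
    return -x


def c_square(x):
    """A hypothetical alternative that also tends to 0 and -1 at the ends."""
    return -x * x


me_ok = satisfies_equations(c_me)
square_ok = satisfies_equations(c_square)
```

Here `c_square` has the right limits at 0 and 1 but already violates the first equation, which illustrates that the limit conditions alone do not determine c.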



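Proof of Proposition 5.5.1:

Suppose $(P, \mathcal{R}) \in AV$, $\mathcal{R} = \{(B_1|A_1)[x_1], \ldots, (B_n|A_n)[x_n]\}$, and let $P_1^*, P_2^*$ be distributions of type (5.5),

$$
\begin{aligned}
P_1^*(\omega) &= \alpha_0 P(\omega) \prod_{\omega \models A_iB_i} \alpha_i^{1-x_i} \prod_{\omega \models A_i\overline{B_i}} \alpha_i^{-x_i}, \\
P_2^*(\omega) &= \beta_0 P(\omega) \prod_{\omega \models A_iB_i} \beta_i^{1-x_i} \prod_{\omega \models A_i\overline{B_i}} \beta_i^{-x_i},
\end{aligned}
$$

with non-negative real numbers $(\alpha_i)_{0 \leqslant i \leqslant n}$, $(\beta_i)_{0 \leqslant i \leqslant n}$ fulfilling equations (5.6). Let $\Omega^* = \{\omega \in \Omega \mid P(\omega) > 0\}$. Without loss of generality, we may assume that all $x_i \neq 0, 1$ (in those cases, $\alpha_i = \beta_i$). So $P_1^*(\omega), P_2^*(\omega) > 0$ for all $\omega \in \Omega^*$.

We calculate the cross-entropy between $P_1^*$ and $P_2^*$:

$$
\begin{aligned}
R(P_1^*, P_2^*) &= \sum_{\omega \in \Omega^*} P_1^*(\omega) \log \frac{P_1^*(\omega)}{P_2^*(\omega)} \\
&= \sum_{\omega \in \Omega^*} P_1^*(\omega) \log \Big[ \frac{\alpha_0}{\beta_0} \prod_{\omega \models A_iB_i} \Big(\frac{\alpha_i}{\beta_i}\Big)^{1-x_i} \prod_{\omega \models A_i\overline{B_i}} \Big(\frac{\alpha_i}{\beta_i}\Big)^{-x_i} \Big] \\
&= \sum_{\omega \in \Omega^*} P_1^*(\omega) \Big[ (\log\alpha_0 - \log\beta_0) + \sum_{\omega \models A_iB_i} (1-x_i)(\log\alpha_i - \log\beta_i) + \sum_{\omega \models A_i\overline{B_i}} (-x_i)(\log\alpha_i - \log\beta_i) \Big] \\
&= [\log\alpha_0 - \log\beta_0] \sum_{\omega \in \Omega^*} P_1^*(\omega) \\
&\quad + \sum_{1 \leqslant i \leqslant n} (1-x_i)(\log\alpha_i - \log\beta_i) \sum_{\omega \in \Omega^*:\ \omega \models A_iB_i} P_1^*(\omega) \\
&\quad + \sum_{1 \leqslant i \leqslant n} (-x_i)(\log\alpha_i - \log\beta_i) \sum_{\omega \in \Omega^*:\ \omega \models A_i\overline{B_i}} P_1^*(\omega) \\
&= [\log\alpha_0 - \log\beta_0] + \sum_{1 \leqslant i \leqslant n} (1-x_i)(\log\alpha_i - \log\beta_i)\, P_1^*(A_iB_i) + \sum_{1 \leqslant i \leqslant n} (-x_i)(\log\alpha_i - \log\beta_i)\, P_1^*(A_i\overline{B_i}) \\
&= [\log\alpha_0 - \log\beta_0] + \sum_{1 \leqslant i \leqslant n} (\log\alpha_i - \log\beta_i) \big[ (1-x_i) P_1^*(A_iB_i) - x_i P_1^*(A_i\overline{B_i}) \big] \\
&= [\log\alpha_0 - \log\beta_0]
\end{aligned}
$$

because $P_1^*(B_i|A_i) = x_i$ for all $1 \leqslant i \leqslant n$.

In the same way,

$$R(P_2^*, P_1^*) = [\log\beta_0 - \log\alpha_0]$$

can be derived. But now both equations together imply

$$R(P_1^*, P_2^*) = 0,$$

since cross-entropy is non-negative. By using its positivity (cf. [Sho86]), both distributions must be identical. This proves the proposition. □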
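
The uniqueness argument above rests on two properties of cross-entropy: it is never negative, and it vanishes only when both distributions coincide. The following Python sketch illustrates these properties on a small example. The function name `cross_entropy` and the sample distributions `p` and `q` are illustrative assumptions, not taken from the text.

```python
import math


def cross_entropy(q, p):
    """R(Q, P) = sum over worlds of Q(w) * log(Q(w) / P(w)),
    using the convention 0 * log 0 = 0."""
    total = 0.0
    for qw, pw in zip(q, p):
        if qw > 0:
            if pw == 0:
                return math.inf  # Q puts mass where P has none
            total += qw * math.log(qw / pw)
    return total


# Two different distributions over four worlds (made-up numbers).
p = [0.4, 0.3, 0.2, 0.1]
q = [0.25, 0.25, 0.25, 0.25]

forward = cross_entropy(q, p)   # R(Q, P)
backward = cross_entropy(p, q)  # R(P, Q)
same = cross_entropy(p, p)      # R(P, P)
```

For distinct distributions both directions are strictly positive, so two values of the form $\log\alpha_0 - \log\beta_0$ and $\log\beta_0 - \log\alpha_0$, which sum to zero, can only both be cross-entropies if each is zero.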



Proofs of Chapter 6 



Proof of Proposition 6.2.1: If $P *_{ME} \mathcal{R} = P$ then $P \models \mathcal{R}$ because $P *_{ME} \mathcal{R}$ is a model of $\mathcal{R}$.

Conversely, if $P \models \mathcal{R}$ then $\alpha_i = 1$, $1 \leqslant i \leqslant n$, provide a solution to (5.6) because of $x_i P(A_i\overline{B_i}) = (1-x_i) P(A_iB_i)$, $1 \leqslant i \leqslant n$. These factors leave the prior distribution unchanged, yielding the trivial posterior $P$ of type (5.5). The uniqueness statement of Proposition 5.5.1 now implies $P *_{ME} \mathcal{R} = P$. □
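
For a single conditional $(B|A)[x]$ with $0 < x < 1$, the observation that $\alpha = 1$ solves the factor equation exactly when the prior already satisfies the conditional can be checked numerically. The sketch below assumes that, for one conditional, the equation reduces to $\alpha = \frac{x}{1-x} \cdot \frac{P(A\overline{B})}{P(AB)}$, which is what the product form (5.5) gives when the posterior is required to satisfy $P^*(B|A) = x$. The function name `me_factor` and the numbers are hypothetical.

```python
def me_factor(p_ab, p_a_notb, x):
    """Factor alpha for a single conditional (B|A)[x], 0 < x < 1.

    Assumed reduced form: alpha = x/(1-x) * P(A and not B) / P(A and B).
    """
    return x / (1 - x) * p_a_notb / p_ab


# Made-up prior with P(AB) = 0.3 and P(A not-B) = 0.1, i.e. P(B|A) = 0.75.
alpha_satisfied = me_factor(0.3, 0.1, 0.75)  # prior already fulfils the rule
alpha_revised = me_factor(0.3, 0.1, 0.9)     # prior must be shifted upwards
```

When the prior already assigns probability 0.75 to the conditional, the factor is 1 (up to rounding) and the posterior equals the prior; asking for 0.9 instead gives a factor above 1, which raises the weight of the verifying worlds.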



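Proof of Proposition 6.2.2: Inclusion and idempotence are clear with Proposition 6.2.1, and logical coherence implies cumulativity:

If $\mathcal{R} \subseteq \mathcal{S}$, then $P *_{ME} \mathcal{S} = P *_{ME} (\mathcal{R} \cup \mathcal{S}) = (P *_{ME} \mathcal{R}) *_{ME} \mathcal{S}$ because of equation (6.2). By prerequisite $P *_{ME} \mathcal{R} \models \mathcal{S}$, so again Proposition 6.2.1 implies $(P *_{ME} \mathcal{R}) *_{ME} \mathcal{S} = P *_{ME} \mathcal{R}$.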



Proof of Proposition 6.2.3: Proposition 6.2.3 is proved by making use of the unique minimum property (5.4), page 76, of the ME-distribution.

For $1 \leqslant i \leqslant m-1$, we have $P *_{ME} \mathcal{R}_i \models \mathcal{R}_{i+1}$, and $P *_{ME} \mathcal{R}_m \models \mathcal{R}_1$. Set $P_i^* := P *_{ME} \mathcal{R}_i$, $1 \leqslant i \leqslant m$. For $1 \leqslant i \leqslant m-1$, $P_{i+1}^*$ is the unique distribution such that the cross-entropy $R(P_{i+1}^*, P)$ is minimal among all distributions $Q$ that satisfy $Q \models \mathcal{R}_{i+1}$. Because $P_i^* \models \mathcal{R}_{i+1}$, it holds that $R(P_i^*, P) \geqslant R(P_{i+1}^*, P)$ for all $i = 1, \ldots, m-1$. By an analogous argumentation, $R(P_m^*, P) \geqslant R(P_1^*, P)$. But this implies $R(P_1^*, P) = \ldots = R(P_m^*, P)$ and hence $P_1^* = \ldots = P_m^*$. This completes the proof. □

Proof of Proposition 6.3.1: Suppose $B[x] \in C_P^{ME}(\mathcal{R} \cup \{A[1]\})$, i.e. $P *_{ME} (\mathcal{R} \cup \{A[1]\}) \models B[x]$. If $P *_{ME} \mathcal{R} \models A[1]$, then $P *_{ME} (\mathcal{R} \cup \{A[1]\}) = P *_{ME} \mathcal{R}$, according to (6.2). Therefore $P *_{ME} \mathcal{R} \models B[x]$ in this case, which was to be proven. □



Proof of Proposition 6.3.2: (i) shows that ME-inference generalizes Bayesian conditioning. It may easily be proved by using (5.5) and (5.6) but can also be found in the works of other authors (cf. e.g. [Par94]). The proof of (ii) is formal but straightforward from (5.5) and (5.6). (iii) follows from (i) and the definitions. □



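Proof of Proposition 6.4.1: According to equations (5.5) and (5.6), the posterior distribution $P_0 *_{ME} \{(b|a)[x_1], (c|b)[x_2]\}$ may be calculated as shown in the following table:

$$
\begin{array}{c|l||c|l}
\omega & P_0 * \mathcal{R} = P^* & \omega & P_0 * \mathcal{R} = P^* \\ \hline
abc & \alpha_0 \alpha_1^{1-x_1} \alpha_2^{1-x_2} & \overline{a}bc & \alpha_0 \alpha_2^{1-x_2} \\
ab\overline{c} & \alpha_0 \alpha_1^{1-x_1} \alpha_2^{-x_2} & \overline{a}b\overline{c} & \alpha_0 \alpha_2^{-x_2} \\
a\overline{b}c & \alpha_0 \alpha_1^{-x_1} & \overline{a}\overline{b}c & \alpha_0 \\
a\overline{b}\overline{c} & \alpha_0 \alpha_1^{-x_1} & \overline{a}\overline{b}\overline{c} & \alpha_0
\end{array}
$$

with

$$\alpha_1 = \frac{x_1}{1-x_1} \cdot \frac{2}{\alpha_2^{1-x_2} + \alpha_2^{-x_2}} = \frac{x_1}{1-x_1} \cdot \frac{2\,\alpha_2^{x_2}}{\alpha_2 + 1} \quad\text{and}\quad \alpha_2 = \frac{x_2}{1-x_2}.$$

Now the probability of $(c|a)$ may be calculated in a straightforward manner:

$$
\begin{aligned}
P^*(c|a) &= \frac{P^*(ac)}{P^*(a)} = \frac{P^*(abc) + P^*(a\overline{b}c)}{P^*(abc) + P^*(ab\overline{c}) + P^*(a\overline{b}c) + P^*(a\overline{b}\overline{c})} \\
&= \frac{\alpha_1\alpha_2^{1-x_2} + 1}{\alpha_1\alpha_2^{-x_2}(\alpha_2 + 1) + 2} \\
&= \tfrac{1}{2}\,(2x_1x_2 + 1 - x_1),
\end{aligned}
$$

as desired. □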
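
The closed-form factors in the proof above can be checked numerically. The following Python sketch builds the posterior over the eight worlds from those factors, assuming that $P_0$ is the uniform distribution (so the prior probabilities are absorbed into the normalizing constant), and compares $P^*(c|a)$ with $\frac{1}{2}(2x_1x_2 + 1 - x_1)$. The function names `chained_posterior` and `conditional` and the sample values of $x_1, x_2$ are illustrative assumptions.

```python
from itertools import product


def chained_posterior(x1, x2):
    """Posterior over worlds (a, b, c) for a uniform prior and
    R = {(b|a)[x1], (c|b)[x2]}, built from the closed-form ME factors."""
    a2 = x2 / (1 - x2)
    a1 = x1 / (1 - x1) * 2 / (a2 ** (-x2) * (a2 + 1))
    weights = {}
    for a, b, c in product([True, False], repeat=3):
        w = 1.0
        if a:  # (b|a) is applicable: verified if b, falsified otherwise
            w *= a1 ** (1 - x1) if b else a1 ** (-x1)
        if b:  # (c|b) is applicable: verified if c, falsified otherwise
            w *= a2 ** (1 - x2) if c else a2 ** (-x2)
        weights[(a, b, c)] = w
    total = sum(weights.values())  # normalization plays the role of alpha_0
    return {world: w / total for world, w in weights.items()}


def conditional(p, consequent, antecedent):
    """P(consequent | antecedent) for predicates on worlds (a, b, c)."""
    num = sum(v for w, v in p.items() if antecedent(w) and consequent(w))
    den = sum(v for w, v in p.items() if antecedent(w))
    return num / den


x1, x2 = 0.8, 0.6
post = chained_posterior(x1, x2)
p_b_given_a = conditional(post, lambda w: w[1], lambda w: w[0])
p_c_given_b = conditional(post, lambda w: w[2], lambda w: w[1])
p_c_given_a = conditional(post, lambda w: w[2], lambda w: w[0])
closed_form = 0.5 * (2 * x1 * x2 + 1 - x1)
```

The first two conditional probabilities confirm that the posterior satisfies both rules; the third agrees with the closed-form expression up to rounding.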



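Proof of Proposition 6.4.2: Let $\alpha_1, \alpha_2, \alpha_3$ be the factors associated with the probabilistic conditionals $(b|a)[x_1]$, $(b|c)[x_2]$, $(a|c)[1]$. Using equations (5.6) and (5.7) we obtain $\alpha_3 = \infty$, thus, by convention, $\alpha_3^0 = 1$ and $\alpha_3^{-1} = 0$. This implies

$$\alpha_1 = \frac{x_1}{1-x_1} \cdot \frac{\alpha_2^{-x_2} + 1}{\alpha_2^{1-x_2} + 1}$$

and

$$\alpha_2 = \frac{x_2}{1-x_2} \cdot \frac{1}{\alpha_1},$$

thus $\alpha_1\alpha_2 = \frac{x_2}{1-x_2}$. According to (5.5), the posterior probability of the conditional $(b|ac)$ can now be calculated as follows:

$$
\begin{aligned}
P^*(b|ac) &= \frac{P^*(abc)}{P^*(abc) + P^*(a\overline{b}c)} \\
&= \frac{\alpha_1^{1-x_1}\alpha_2^{1-x_2}}{\alpha_1^{1-x_1}\alpha_2^{1-x_2} + \alpha_1^{-x_1}\alpha_2^{-x_2}} \\
&= \frac{\alpha_1\alpha_2}{\alpha_1\alpha_2 + 1} \\
&= x_2.
\end{aligned}
$$

□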



Proof of Proposition 6.4.3: Let $\alpha_1$ be the ME-factor belonging to the first conditional, and $\alpha_2$ that of the second one. Then immediately $\alpha_1 = \frac{x_1}{1-x_1}$ and $\alpha_2 = \frac{x_2}{1-x_2}$, by (5.6), so that

$$
P^*(c|ab) = \frac{\alpha_1^{1-x_1}\alpha_2^{1-x_2}}{\alpha_1^{1-x_1}\alpha_2^{1-x_2} + \alpha_1^{1-x_1}\alpha_2^{-x_2}} = \frac{\alpha_2}{\alpha_2 + 1} = x_2.
$$

□



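Proof of Proposition 6.4.4: Let $\alpha_1, \alpha_2$ be the ME-factors associated with the conditionals $(c|ab)[x_1]$, $(b|a)[x_2]$ in $\mathcal{R}$. By using (5.6), we see $\alpha_1 = \frac{x_1}{1-x_1}$ and $\alpha_2 = \frac{x_2}{1-x_2} \cdot \frac{2\,\alpha_1^{x_1}}{\alpha_1 + 1}$. According to (5.5), the probability of the conditional in question may be calculated as follows:

$$
\begin{aligned}
P^*(c|a) &= \frac{\alpha_1^{1-x_1}\alpha_2^{1-x_2} + \alpha_2^{-x_2}}{\alpha_1^{1-x_1}\alpha_2^{1-x_2} + \alpha_1^{-x_1}\alpha_2^{1-x_2} + 2\alpha_2^{-x_2}} \\
&= \frac{\alpha_1^{1-x_1}\alpha_2 + 1}{\alpha_1^{-x_1}\alpha_2(\alpha_1 + 1) + 2} \\
&= \frac{\dfrac{x_2}{1-x_2} \cdot \dfrac{2\alpha_1}{\alpha_1 + 1} + 1}{\dfrac{2x_2}{1-x_2} + 2} \\
&= \tfrac{1}{2}\,(2x_1x_2 + 1 - x_2).
\end{aligned}
$$

□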



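Proof of Proposition 6.4.7: The ME-factors $\alpha_1$ and $\alpha_2$ associated with the conditionals in $\mathcal{R}$ (in order of appearance above) are computed to be $\alpha_1 = \frac{x_1}{1-x_1}$ and $\alpha_2 = \frac{x_2}{1-x_2}$. Following (5.5) we thus obtain

$$
\begin{aligned}
P^*(b|a) &= \frac{\alpha_1^{1-x_1} + \alpha_1^{-x_1}}{\alpha_1^{1-x_1} + \alpha_1^{-x_1} + \alpha_2^{1-x_2} + \alpha_2^{-x_2}} \\
&= \frac{\alpha_1^{-x_1}(\alpha_1 + 1)}{\alpha_1^{-x_1}(\alpha_1 + 1) + \alpha_2^{-x_2}(\alpha_2 + 1)} \\
&= \frac{1}{1 + \dfrac{x_1^{x_1}(1-x_1)^{1-x_1}}{x_2^{x_2}(1-x_2)^{1-x_2}}},
\end{aligned}
$$

as desired. The probability of the second conditional $(c|a)$ is obtained by applying the fundamental probabilistic equality $P^*(c|a) = P^*(c|ab)P^*(b|a) + P^*(c|a\overline{b})P^*(\overline{b}|a)$, and using the information given by $\mathcal{R}$. □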
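
The expression for $P^*(b|a)$ can be cross-checked without the closed-form factors, by computing the ME posterior with a generic cyclic scaling scheme. The Python sketch below makes several assumptions that are not stated in this appendix: that the prior is uniform over three binary variables, that the rule set is $\mathcal{R} = \{(c|ab)[x_1], (c|a\overline{b})[x_2]\}$ (inferred from the equality used at the end of the proof), and that repeatedly rescaling verifying and falsifying worlds by $\alpha^{1-x}$ and $\alpha^{-x}$ is an acceptable way to approximate the posterior. The function name `me_revise` and all numbers are hypothetical.

```python
from itertools import product


def me_revise(prior, conditionals, sweeps=50):
    """Cyclic scaling for conditional constraints (B|A)[x], 0 < x < 1.

    prior: dict mapping worlds to positive probabilities.
    conditionals: list of (antecedent, consequent, x), where antecedent and
    consequent are predicates on worlds.
    """
    q = dict(prior)
    for _ in range(sweeps):
        for ante, cons, x in conditionals:
            ver = sum(p for w, p in q.items() if ante(w) and cons(w))
            fal = sum(p for w, p in q.items() if ante(w) and not cons(w))
            # Factor that makes the current distribution satisfy this rule.
            alpha = x * fal / ((1 - x) * ver)
            for w in q:
                if ante(w):
                    q[w] *= alpha ** (1 - x) if cons(w) else alpha ** (-x)
            z = sum(q.values())
            for w in q:
                q[w] /= z
    return q


worlds = list(product([True, False], repeat=3))  # worlds are (a, b, c)
prior = {w: 1 / 8 for w in worlds}
x1, x2 = 0.9, 0.5
rules = [
    (lambda w: w[0] and w[1], lambda w: w[2], x1),      # (c | a b)[x1]
    (lambda w: w[0] and not w[1], lambda w: w[2], x2),  # (c | a not-b)[x2]
]
post = me_revise(prior, rules)

p_a = sum(p for w, p in post.items() if w[0])
p_b_given_a = sum(p for w, p in post.items() if w[0] and w[1]) / p_a


def g(x):
    return x ** x * (1 - x) ** (1 - x)


closed_form = 1 / (1 + g(x1) / g(x2))
```

With these inputs the numerically computed $P^*(b|a)$ agrees with the closed-form value; because the two antecedents are mutually exclusive, the scheme settles after the first sweep.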
Proofs of Chapter 7



Proof of Proposition 7.1.1: Suppose C to be founded. Let W G S* such 
that Choose = 0. Then 'I','!'' ^ so Th*{^) = C^{TT) = 

C^'{'R*) = Th*{'P'), due to foundedness. So S' = 'P' , and C is injective. □ 



Proof of Theorem 7.1.1: Suppose C to be founded and strongly cumu- 
lative. Let P G £*, and let 7^, 5 C (£ | £)* with TZ C S C Cq,{TZ). To prove 
cumulativity for C,^, we have to show C^(JZ) = C^{S). 

Set Th*{Pi) = C^{TZ), Th*{p 2 ) = C<p{S) for suitable epistemic states 
P\,p 2 € £*{Cqr). Because of 7^ C 5, we have P>i Qq, p 2 - Strong cumula- 
tivity now implies Th*{p 2 ) = Cq,{S) = Cq^{S). Moreover, we presupposed 
S C Cq{TZ), therefore P\ \= S. Due to foundedness, (S) = Th*{Pi). Thus 
Th*{Pi) = Th*{p 2 ), i.e. Cq{n) = Cq,{S). □ 



Proofs of Chapter 8 



Proof of Lemma 8.2.1: Suppose ctb(w[^ • . . . • UJ^) = 1, and let (without 
loss of generality) Wi yf ujj for i yf j, 1 < 7, j < m. Then, according to (3.21), 

v,l v,l 



Regarding (8.5) and the presupposition tOi y^ toj for i y^ j, we see that 
o'v,i{^jJk) = or <Jy^i{oJk) = b~; for at most one tOk, respectively. So 



fc:cr„,i(wfc)=b+, 

and ^ Tk 

fc:CT„,i(wfc)=by 



J Tfc if there exists k with ojk = Cy^iv 
\ 0 else 

f Tfc if there exists k with tOk = Cv,iv 
\ 0 else 



So 

1 = n n 

t;,i=^ej2'7 t;,i=^ej2‘7 



Because iFg is free abelian, = 0 for all fc, 1 ^ k ^ m. Therefore 
Proof of Lemma 8.2.2: By Lemma 3.4.2, $(d|C) \sqsubseteq (b|A)$ iff $Cd \leqslant Ab$ and $C\overline{d} \leqslant A\overline{b}$; in particular, $C \leqslant A$. All of $Cd$, $Ab$, $C\overline{d}$, $A\overline{b}$ are elementary conjunctions. In general, if $A_1, A_2$ are two elementary conjunctions, then $A_1 \leqslant A_2$ only if all literals occurring in $A_2$ also occur in $A_1$ (otherwise there would be a world $\omega$ with $\omega \models A_1$ but $\omega \not\models A_2$, a contradiction). So $b$ must occur in $Cd$, and $\overline{b}$ must occur in $C\overline{d}$. This is only possible if $b = d$. The converse statement is trivial. □

Proof of Theorem 8.2.1: It is sufficient to prove aTz{oj) = g o crg(w) for 
all UJ G C. 

= n Ki n Ki 

v,l v,l 

uj = C^jv ^ = 

= n Ki n Ki 

because for each atom v, there is exactly one I such that uj = Cy^iv, v G {v, w}, 
so there is exactly one ^ or h~ ^ occurring in the product above for each v. 
Therefore 

goaeiuj) = g{h+^ <?(b-^) 

= n n n n 

" " _ l^tsgn 

" = = v = bi,C„^i^Ai 

In this product, each a)'", occurs at most once, due to the fact that for different 
atoms V and v', only different af occur in g(b)J";) and g{h^, p), respectively, 
by Lemma 8.2.2. Moreover, for uj = Cy^iv, uj |= AiV iff Cy^i ^ Ai. Using this 
and rearranging the factors, we obtain 

gooB{uj) = n n 

uj\=Aibi uj\=Aib:[ 

= O-Tziuj) 

(cf. (3.19)). □ 

Proof of Proposition 8.2.1: 

o - b ( w[i • . . . • ) = asiuJiY^ ■ ...■ CTBiuJ-mY”' 
= •••

\v,l J \v. 

( 

n Ki n Ki 

\uji=C^^lv 

( 

n b+i n Ki 
= n n KiY^ n 



1) 1 ^ fc^m 

^k=^v,l'^ 



1^ fc^m 
'■^k=^v,l^ 



So ctb(w[^ • . . . • w^) G ker g iff 

i=n n n 9{k,iy 



l) 1^ fc 

^k=^v,l-^ 



l^k^m 

^k=C~,iv 



= : n.. 



«.+ 



= : n„ _ 



For different v, only different and a“ occur in Ily j^ and IIy _, respectively 
(cf. the remarks following equation (8.8)). Because af and a~ are free gene- 
rators of the abelian group Ttz, each of the products iTi,,+ and must 

equal 1, which proves the proposition. □ 

Proof of Lemma 8.2.3: riicfc«;m(b)!’,/ J’'*’ G ker g iff 



/ 



1 = n = n 












n 



n n w) 



l<k<r> 



Similarly, 



= n n 
= n 

bi=v 

n 9 (Kj.)'* = n (“D- 



i-k 






1 

bi =v 



Because a^ and a^ are free generators, the assertion of the lemma follows. □ 
Proof of Lemma 8.2.4: =g 1 means = 1, and in this case,

gW o ft,(l)(b„^;) = = 1 , too. 

So let now b„_; 1, i.e. 1 ^ 5 (b„,i) = On the other hand, 



v = bi,C^ l^Ai 



o /i(i)(b„,i) = = n a*, where C For all 



'^v,i' E we have g(b„,j) = g(b„_//). This means < Ai iff Cy^v < Ai 

for all %py^i> E So by definition, < Ai iff Cy^i < Ai. Therefore each 

a.i occurring in gE) o ;) also occurs in g{hy^i) and vice versa. This 

proves (i). 



/iEloasH = /lE) b+ b„_ 



n n «ii 

V,l,UJ = C.^jjV V,l,iO = C.^j^lV 



^|=DpJ. t; oj\=d\}\ V 

^ >3v ^ 



= CT5(i)(u;). 



This proves part (ii). 



(iii) is now an easy consequence of (i) and (ii). 



Proof of Corollary 8.2.3: Due to parts (ii) and (iii) of the lemma above, 
( 75 ( 0 ) (a)) = 1 implies ag(i){Q) = 1 , and this again implies (Ttz{u)) = 1 , because 
gE) and /lE) are homomorphisms. □ 



Proof of Lemma 8.2.5: First we will exploit equation (8.17) twofold. 



is equivalent to 






n n 



v = bf,D^*\ 
By comparing a^’s occurring on both sides, this means that for each i, 1 ^
i ^ n, such that v = bi and ^ Ai, there is exactly one fc, 1 < fc < m, 

such that ^ Ai, and conversely, for each fc, 1 < A: < m, if ^ Ai 

and V = bi, then In particular, U < A^ iff 



D 



it) 

Vjk 



^ A, 



Furthermore, due to the prerequisites, we have cryz = o ag(t) . Suppose 
there is a fc, 1 < fc < m, such that A ^ A. Then there is w G 12 

' ' ^■>J0 ^:Jk ' 

such that LO ^ and u) |= For this to. 



CTt?,(w) = o cr^ct) (w) 



= n 5^‘HsL‘r) 



J\=D, 



Both occur in the right hand product. Therefore at 

least g^^\D^v\k^ ~ otherwise some a^’s would appear more than once. 

So after eliminating all with A ^ T from as irrelevant, 

we may assume that A = T for all jk occurring in (8.17). 

We start proving part (i) by considering 



9<‘+'>o(.f+‘)(s«.) 



f n 

n 7'«>(41b 

l^k^m 

n n 

l^k^m l^i^n,v = bi 
■",3k ' 

n n 

l^fe^m i^ts:n,9=6, 

D<*) 

",3k • 

n s'Ass.) 

l^k^m 



due to (8.17) and following the remarks above. Furthermore, for 1 ^ A: ^ m, 
we obtain 



dt+i) 
For all other s^\, and so



n 

i^n,v: 

t + l)< 
'-Jfe ^ 

n 



1 ^•i^n,i; = b^ 

n(*+i)<4. 



1 ^•i^n,‘u = b^ 
n(‘) <4, 






!,>'+» oi.<'+')(sm) = 9«(s«) 



This proves (i). 



Let us now compare o ag(t) and a^ct+i) for w G 12: 





1 


0 0-5(0 (w) = 


n A-r 






= n 





■w,l 



n 








or l^jk 


LO 


V 


l^k^m 

,|=d(‘) 

v,Jk 


n 


«■’* n 


n n 


„P + 1)=^ 



„|=i3£‘ + l)» ' ”'J0 



and 



CT5(t+i) (w) 



n 

|=d’^*+i) 



^w,l 



n 

wz^v or l^jk 



,(*+1)=^ 

'w,l 



n 



„(i+l)=^ 

^vjk 



l^k^m 

A=D^JVh 



By assumption, we have Dyjg A = -L for all /c, 1 < fc < m. So w ^ 

d[% V implies uj ^ v, but uj 1= for all fc, 1 < A: < m. Thus for 

•^ijO ' '^:jk ' ' ^ijk ' 
OJ ^ o as(t)(oj) = as(t+i)(uj), as desired. For uj ^

have ui \= w iff a; 1= and therefore 

' ' ^fjk ’ 



O Us(t) (w) 



= n 


n 


„(*+i)=^ 

^V,jk 


■w^v or l^j^. 


l^k^m 




= n 


n 


„(i+l)=^ 

^vjk 


w:?^v or 
co|=i3(„‘ + l).i, 






= 0'g(t+l) (w) 







So (ii) is proven. 

Now, from (i) and (ii), 

O (T5(t+1) 



O O CT5(t) 

g^*'> o as(t) 



by presupposition. This completes the proof of the lemma. 



□ 




Bibliography 



[Acz61] J. Aczél. Vorlesungen über Funktionalgleichungen und ihre Anwendungen.
Birkhaeuser Verlag, Basel, 1961.

[Ada66] E.W. Adams. Probability and the logic of conditionals. In J. Hintikka and
P. Suppes, editors, Aspects of inductive logic, pages 265-316. North-Holland,
Amsterdam, 1966.

[Ada75] E.W. Adams. The Logic of Conditionals. D. Reidel, Dordrecht, 1975.

[AGGR98] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic
subspace clustering of high dimensional data for data mining applications. In
Proceedings of the ACM SIGMOD Conference on Management of Data, Seattle,
Washington, 1998.

[AGM85] C.E. Alchourrón, P. Gärdenfors, and D. Makinson. On the logic of theory
change: Partial meet contraction and revision functions. Journal of Symbolic
Logic, 50(2):510-530, 1985.

[AIS93] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between
sets of items in large databases. In Proceedings of the ACM SIGMOD
Conference on Management of Data, pages 207-216, Washington, DC, 1993.

[AMS+96] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo.
Fast discovery of association rules. In U.M. Fayyad, G. Piatetsky-Shapiro,
P. Smyth, and R. Uthurusamy, editors, Advances in knowledge discovery and
data mining, pages 307-328. MIT Press, Cambridge, Mass., 1996.

[Ant97] G. Antoniou. Nonmonotonic reasoning. MIT Press, Cambridge,
Massachusetts, 1997.

[Bac90] F. Bacchus. Representing and Reasoning with Probabilistic Knowledge: a
Logical Approach to Probabilities. MIT Press, Cambridge, Mass., 1990.

[BG93] C. Boutilier and M. Goldszmidt. Revision by conditional beliefs. In
Proceedings 11th National Conference on Artificial Intelligence (AAAI'93), pages
649-654, Washington, DC, 1993.

[Bol96] T. Bollinger. Assoziationsregeln - Analyse eines Data Mining Verfahrens.
Informatik-Spektrum, 19:257-261, 1996.

[Bou93] C. Boutilier. Revision sequences and nested conditionals. In Proceedings
International Joint Conference on Artificial Intelligence (IJCAI'93), pages
519-525, 1993.







[Bou94] C. Boutilier. Unifying default reasoning and belief revision in a modal
framework. Artificial Intelligence, 68:33-85, 1994.

[BP99] R.A. Bourne and S. Parsons. Maximum entropy and variable strength
defaults. In Proceedings Sixteenth International Joint Conference on Artificial
Intelligence, IJCAI'99, pages 50-55, 1999.

[Bre89] G. Brewka. Preferred subtheories: An extended logical framework for
default reasoning. In Proceedings Eleventh Joint Conference on Artificial
Intelligence, volume 2, pages 1043-1048, San Mateo, Ca., 1989.

[Bre94] G. Brewka. Reasoning about priorities in default logic. In Proceedings of the
12th National Conference on Artificial Intelligence (AAAI'94), pages 940-945.
AAAI/MIT Press, 1994.

[Bre96] G. Brewka, editor. Principles of Knowledge Representation. CSLI
Publications, 1996.

[BS84] B.G. Buchanan and E.H. Shortliffe. Rule-based expert systems. The MYCIN
experiments of the Stanford Heuristic Programming Project. Addison-Wesley,
Reading, MA, 1984.

[Cal91] P.G. Calabrese. Deduction and inference using conditional logic and
probability. In I.R. Goodman, M.M. Gupta, H.T. Nguyen, and G.S. Rogers, editors,
Conditional Logic in Expert Systems, pages 71-100. Elsevier, North Holland,
1991.

[Cox46] R.T. Cox. Probability, frequency and reasonable expectation. American
Journal of Physics, 14(1):1-13, 1946.

[Csi75] I. Csiszár. I-divergence geometry of probability distributions and
minimization problems. Ann. Prob., 3:146-158, 1975.

[DeF74] B. DeFinetti. Theory of Probability, volume 1,2. John Wiley and Sons,
New York, 1974.

[Dem67] A.P. Dempster. Upper and lower probabilities induced by a multivalued
mapping. Ann. Math. Stat., 38:325-339, 1967.

[DGC94] D. Dubois, I.R. Goodman, and P.G. Calabrese. Special issue on the
conditional event algebra. IEEE Transactions on Systems, Man, and Cybernetics,
24(12), 1994.

[DLP94] D. Dubois, J. Lang, and H. Prade. Possibilistic logic. In D.M. Gabbay,
C.H. Hogger, and J.A. Robinson, editors, Handbook of Logic in Artificial
Intelligence and Logic Programming, volume 3. Oxford University Press, 1994.

[DP91a] D. Dubois and H. Prade. Conditional objects and non-monotonic reasoning.
In Proceedings 2nd Int. Conference on Principles of Knowledge Representation
and Reasoning (KR'91), pages 175-185. Morgan Kaufmann, 1991.

[DP91b] D. Dubois and H. Prade. Conditioning, non-monotonic logic and
non-standard uncertainty models. In I.R. Goodman, M.M. Gupta, H.T. Nguyen,
and G.S. Rogers, editors, Conditional Logic in Expert Systems, pages 115-158.
Elsevier, North Holland, 1991.







[DPQlc] D. Dubois and H. Prade. Epistemic entrenchment and possibilistic logic. 
Artificial Intelligence, 50:223-239, 1991. 

[DP92] D. Dubois and H. Prade. Belief change and possibility theory. In 
P. Gardenfors, editor, Belief revision, pages 142-182. Cambridge University 
Press, 1992. 

[DP94] D. Dubois and H. Prade. A survey of belief revision and updating rules in 
various uncertainty models. Intern. Journal of Intelligent Systems, 9:61-100, 
1994. 

[DP96] D. Dubois and H. Prade. Non-standard theories of uncertainty in plausible 
reasoning. In G. Brewka, editor. Principles of Knowledge Representation. CSLI 
Publications, 1996. 

[DP97a] A. Darwiche and J. Pearl. On the logic of iterated belief revision. Artificial 
Intelligence, 89:1-29, 1997. 

[DP97b] D. Dubois and H. Prade. Focusing vs. belief revision: A fundamental 
distinction when dealing with generic knowledge. In Proceedings First Interna- 
tional Joint Conference on Qualitative and Quantitative Practical Reasoning, 
ECSQARU-FAPR’97, pages 96-107, Berlin Heidelberg New York, 1997. Sprin- 
ger. 

[DPT90] D. Dubois, H. Prade, and J.-M. Toucas. Inference with imprecise numerical
quantifiers. In Z.W. Ras and M. Zemankova, editors, Intelligent Systems
- state of the art and future directions, pages 52-72. Ellis Horwood Ltd., Chi- 
chester, England, 1990. 

[Dub86] D. Dubois. Belief structures, possibility theory and decomposable
confidence measures on finite sets. Computers and Artificial Intelligence, 5:403-416,
1986. 

[FH94] N. Friedman and J.Y. Halpern. Conditional logics of belief change. In 
Proceedings 12th National Conference on Artificial Intelligence, AAAI-94, 1994. 

[FH96] N. Friedman and J.Y. Halpern. Plausibility measures and default reasoning. 
In Proceedings 13th National Conference on Artificial Intelligence, AAAI-96, 
volume 2, 1996. 

[FH99] N. Friedman and J.Y. Halpern. Belief revision: a critique. Journal of Logic, 
Language and Information, 8, 1999. 

[FPSS96] U. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining 
to knowledge discovery: An overview. In U.M. Fayyad, G. Piatetsky-Shapiro, 
P. Smyth, and R. Uthurusamy, editors, Advances in knowledge discovery and 
data mining. MIT Press, Cambridge, Mass., 1996. 

[FPSSU96] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Ad- 
vances in knowledge discovery and data mining. MIT Press, Cambridge, Mass., 
1996. 

[FR99] B. Fine and G. Rosenberger. Algebraic Generalizations of Discrete Groups.
Dekker, New York, Basel, 1999. 







[Fre98] M. Freund. Preferential orders and plausibility measures. J. Logic Compu- 
tat, 8:147-158, 1998. 

[FS96] V.G. Fischer and M. Schramm. Tabl - a tool for efficient compilation of 
probabilistic constraints. Technical Report TUM-19636, Technische Universität
München, 1996.

[FU+96] U. Fayyad, R. Uthurusamy, et al. Data mining and knowledge discovery
in databases. Communications of the ACM, 39(11):24-64, 1996.

[Gab85] D. Gabbay. Theoretical foundations for nonmonotonic reasoning in expert 
systems. In K. Apt, editor, Logics and models of concurrent systems. Springer,
Berlin, 1985. 

[Gar88] P. Gärdenfors. Knowledge in Flux: Modeling the Dynamics of Epistemic
States. MIT Press, Cambridge, Mass., 1988.

[Gar92] P. Gärdenfors. Belief revision and nonmonotonic logic: Two sides of the
same coin? In Proceedings European Conference on Artificial Intelligence, 
ECAI’92, pages 768-773. Pitman Publishing, 1992. 

[Gef92] H. Geffner. Default Reasoning: Causal and Conditional Theories. MIT 
Press, Cambridge, Mass., 1992. 

[Ger97] K. Gerhards. Verwendung probabilistischer Logik in einem medizinischen 
Expertensystem. Master’s thesis, FernUniversität Hagen, 1997. (Diplomarbeit).

[GGNR91] I.R. Goodman, M.M. Gupta, H.T. Nguyen, and G.S. Rogers, editors. 
Conditional Logic in Expert Systems. Elsevier, North Holland, 1991. 

[GH98] A.J. Grove and J.Y. Halpern. Updating sets of probabilities. In Proceedings 
Fourteenth Conference on Uncertainty in AI, pages 173-182, 1998. 

[GHK94] A.J. Grove, J.Y. Halpern, and D. Koller. Random worlds and maximum
entropy. J. of Artificial Intelligence Research, 2:33-88, 1994.

[GHR94] D.M. Gabbay, C.H. Hogger, and J.A. Robinson, editors. Handbook of 
Logic in Artificial Intelligence and Logic Programming, volume 3. Oxford Uni- 
versity Press, 1994. 

[GM94] P. Gärdenfors and D. Makinson. Nonmonotonic inference based on
expectations. Artificial Intelligence, 65:197-245, 1994.

[GMP90] M. Goldszmidt, P. Morris, and J. Pearl. A maximum entropy approach 
to nonmonotonic reasoning. In Proceedings AAAI-90, pages 646-652, Boston, 
1990. 

[GMP93] M. Goldszmidt, P. Morris, and J. Pearl. A maximum entropy approach to 
nonmonotonic reasoning. IEEE Transactions on Pattern Analysis and Machine 
Intelligence, 15(3):220-232, 1993. 

[Gol94] M. Goldszmidt. Research issues in qualitative and abstract probability. AI 
Magazine, pages 63-66, Winter 1994. 

[Goo63] I.J. Good. Maximum entropy for hypothesis formulation, especially for 
multidimensional contingency tables. Ann. Math. Statist., 34:911-934, 1963. 







[GP92] M. Goldszmidt and J. Pearl. Rank-based systems: A simple approach to 
belief revision, belief update, and reasoning about evidence and actions. In 
Proceedings Third International Conference on Principles of Knowledge
Representation and Reasoning, pages 661-672, Cambridge, Mass., 1992.

[GP96] M. Goldszmidt and J. Pearl. Qualitative probabilities for default reasoning, 
belief revision, and causal modeling. Artificial Intelligence, 84:57-112, 1996. 

[GR94] P. Gärdenfors and H. Rott. Belief revision. In D.M. Gabbay, C.H. Hogger,
and J.A. Robinson, editors, Handbook of Logic in Artificial Intelligence and 
Logic Programming, pages 35-132. Oxford University Press, 1994. 

[Gra91] G. Grahne. Updates and counterfactuals. In Proceedings Second Interna- 
tional Conference on Principles of Knowledge Representation and Reasoning, 
KR-91, pages 269-276, 1991. 

[Gro88] A. Grove. Two modellings for theory change. Journal of Philosophical 
Logic, 17:157-170, 1988. 

[Hal99a] J.Y. Halpern. A counterexample to theorems of Cox and Fine. Journal of
AI Research, 10:76-85, 1999. 

[Hal99b] J.Y. Halpern. Cox’s theorem revisited. Journal of AI Research,
11:429-435, 1999.

[Han89] S.O. Hansson. New operators for theory change. Theoria, 55:114-132, 
1989. 

[Han91] S.O. Hansson. Belief base dynamics. PhD thesis, Uppsala University, 1991. 

[Har77] W.L. Harper. Rational conceptual change. PSA 1976, 2:462-494, 1977. 

[Hec96] D. Heckerman. Bayesian networks for knowledge discovery. In U.M. 
Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances
in knowledge discovery and data mining. MIT Press, Cambridge, Mass., 1996. 

[Her91] H. Herre. Nonmonotonic reasoning and logic programs. In Proceedings First 
International Workshop on Nonmonotonic and Inductive Logic, Karlsruhe,
Germany, 1990, pages 38-58. Springer, Lecture Notes in Artificial Intelligence 543,
1991. 

[HHJ92] P. Hájek, T. Havránek, and R. Jiroušek. Uncertain Information Processing
in Expert Systems. CRC-Press, Boca Raton, Florida, 1992. 

[Hoc99] S. Hoche. Possibilistic belief revision - foundations and implementation. 
Master’s thesis, FernUniversität Hagen, 1999. (Diplomarbeit).

[Isr87] D.J. Israel. What’s wrong with nonmonotonic logic? In M.L. Ginsberg, 
editor, Readings in Nonmonotonic Reasoning, pages 53-57. Morgan Kaufmann,
1987. 

[Jay83a] E.T. Jaynes. Papers on Probability, Statistics and Statistical Physics. D. 
Reidel Publishing Company, Dordrecht, Holland, 1983.

[Jay83b] E.T. Jaynes. Where do we stand on maximum entropy? In Papers on Pro- 
bability, Statistics and Statistical Physics, pages 210-314. D. Reidel Publishing 
Company, Dordrecht, Holland, 1983. 







[Jef83] R. Jeffrey. The logic of decision. University of Chicago Press, Chicago, IL, 
1983. 

[Jen96] F.V. Jensen. Introduction to Bayesian networks. UCL Press, London, 1996. 

[JS83] R.W. Johnson and J.E. Shore. Comments on and correction to “Axiomatic 
derivation of the principle of maximum entropy and the principle of minimum 
cross-entropy”. IEEE Transactions on Information Theory, IT-29(6):942-943, 
1983. 

[KGK93] R. Kruse, J. Gebhardt, and F. Klawonn. Fuzzy-Systeme. Teubner,
Stuttgart, 1993.

[KI96a] G. Kern-Isberner. Characterizing the principle of minimum cross-entropy 
within a conditional logical framework. Informatik Fachbericht 206,
FernUniversität Hagen, 1996.

[KI96b] G. Kern-Isberner. Conditional logics and entropy. Informatik Fachbericht 
203, FernUniversität Hagen, 1996.

[KI97a] G. Kern-Isberner. A conditional-logical approach to minimum
cross-entropy. In Proceedings 14th Symposium on Theoretical Aspects of Computer
Science STACS’97, pages 237 - 248, Berlin Heidelberg New York, 1997. Sprin- 
ger. 

[KI97b] G. Kern-Isberner. A logically sound method for uncertain reasoning with 
quantified conditionals. In Proceedings First International Conference on
Qualitative and Quantitative Practical Reasoning, ECSQARU-FAPR’97, pages 365
- 379, Berlin Heidelberg New York, 1997. Springer. 

[KI97c] G. Kern-Isberner. The principle of minimum cross-entropy and conditional
logic. In Proceedings of the Third Dutch/German Workshop on Nonmonotonic
Reasoning Techniques and their Applications, DGNMR-97, pages 73-82,
Saarbrücken, Germany, 1997. MPI for Computer Science.

[KI98a] G. Kern-Isberner. Characterizing the principle of minimum cross-entropy 
within a conditional-logical framework. Artificial Intelligence, 98:169-208, 1998. 

[KI98b] G. Kern-Isberner. Nonmonotonic reasoning in probabilistics. In Proceedings
European Conference on Artificial Intelligence, ECAI’98, pages 580-584,
West Sussex, UK, 1998. Wiley & Sons. 

[KI98c] G. Kern-Isberner. A note on conditional logics and entropy. International 
Journal of Approximate Reasoning, 19:231-246, 1998. 

[KI99a] G. Kern-Isberner. Following conditional structures of knowledge. In KI- 
99: Advances in Artificial Intelligence, Proceedings of the 23rd Annual German 
Conference on Artificial Intelligence, pages 125-136. Springer Lecture Notes in 
Artificial Intelligence LNAI 1701, 1999.

[KI99b] G. Kern-Isberner. Postulates for conditional belief revision. In Proceedings 
Sixteenth International Joint Conference on Artificial Intelligence, IJCAI-99, 
pages 186-191. Morgan Kaufmann, 1999. 







[KI99c] G. Kern-Isberner. Revising by conditional beliefs. In Proceedings Fourth 
Dutch-German Workshop on Nonmonotonic Reasoning Techniques and Their
Applications, DGNMR-99, University of Amsterdam, 1999. Institute for Logic, 
Language and Computation. 

[KI01] G. Kern-Isberner. Revising and updating probabilistic beliefs. In M.-A.
Williams and H. Rott, editors, Frontiers in belief revision, pages 329-344. Kluwer
Academic Publishers, Dordrecht, 2001. (to appear). 

[KIR96] G. Kern-Isberner and H.P. Reidmacher. Interpreting a contingency table 
by rules. International Journal of Intelligent Systems, 11(6), 1996. 

[KLM90] S. Kraus, D. Lehmann, and M. Magidor. Nonmonotonic reasoning, pre- 
ferential models and cumulative logics. Artificial Intelligence, 44:167-207, 1990. 

[KM91a] H. Katsuno and A. Mendelzon. Propositional knowledge base revision 
and minimal change. Artificial Intelligence, 52:263-294, 1991. 

[KM91b] H. Katsuno and A.O. Mendelzon. On the difference between updating a 
knowledge base and revising it. In Proceedings Second International Conference 
on Principles of Knowledge Representation and Reasoning, KR’91, pages 387- 
394, San Mateo, Ca., 1991. Morgan Kaufmann. 

[KS91] H. Katsuno and K. Satoh. A unified view of consequence relation, belief 
revision and conditional logic. In Proceedings Twelfth International Joint Con- 
ference on Artificial Intelligence, IJCAI-91, pages 406-412, 1991. 

[Kul68] S. Kullback. Information Theory and Statistics. Dover, New York, 1968. 

[Lau82] S.L. Lauritzen. Lectures on Contingency Tables. Aalborg University Press, 
Denmark, 1982. 

[Lem82] J.F. Lemmer. Efficient minimum information updating for bayesian infe- 
rencing in expert systems. In Proceedings of the National Conference on Arti- 
ficial Intelligence, AAAI-82, 1982. 

[Lem83] J.F. Lemmer. Generalized bayesian updating of incompletely specified 
distributions. Large Scale Systems, 5:51-68, 1983.

[Lev77] I. Levi. Direct inference. The journal of philosophy, 74:5-29, 1977. 

[Lev88] I. Levi. Iteration of conditionals and the Ramsey test. Synthese, 76:49-81, 
1988. 

[Lew73] D. Lewis. Counterfactuals. Harvard University Press, Cambridge, Mass., 
1973. 

[Lew76] D. Lewis. Probabilities of conditionals and conditional probabilities. The 
Philosophical Review, 85:297-315, 1976. 

[LKI99] T. Lukasiewicz and G. Kern-Isberner. Probabilistic logic programming 
under maximum entropy. In Proceedings ECSQARU-99, volume 1638, pages 
279-292. Springer Lecture Notes in Artificial Intelligence, 1999. 

[LM92] D. Lehmann and M. Magidor. What does a conditional knowledge base 
entail? Artificial Intelligence, 55:1-60, 1992. 







[LMS01] D. Lehmann, M. Magidor, and K. Schlechta. Distance semantics for belief
revision. Journal of Symbolic Logic, 2001. (to appear). 

[LS77] R.C. Lyndon and P.E. Schupp. Combinatorial group theory. Springer, Berlin 
Heidelberg New York, 1977. 

[LS88] S.L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities 
in graphical structures and their applications to expert systems. Journal of the
Royal Statistical Society B, 50(2):415-448, 1988. 

[Mak89] D. Makinson. General theory of cumulative inference. In M. Reinfrank 
et al., editors, Non-monotonic Reasoning, pages 1-18. Springer Lecture Notes
on Artificial Intelligence 346, Berlin, 1989. 

[Mak94] D. Makinson. General patterns in nonmonotonic reasoning. In D.M. Gabbay,
C.H. Hogger, and J.A. Robinson, editors, Handbook of Logic in Artificial
Intelligence and Logic Programming, volume 3, pages 35-110. Oxford University 
Press, 1994. 

[Mal89] F.M. Malvestuto. Computing the maximum-entropy extension of given 
discrete probability distributions. Computational statistics and Data analysis, 
8:299-311, 1989. 

[McC80] J. McCarthy. Circumscription - a form of nonmonotonic reasoning. Arti- 
ficial Intelligence, 13:27-39, 1980. 

[MD80] D. McDermott and J. Doyle. Non-monotonic logic I. Artificial Intelligence, 
13:41-72, 1980. 

[Mey98] C.H. Meyer. Korrektes Schliessen bei unvollständiger Information. Peter
Lang Verlag, 1998. 

[MG91] D. Makinson and P. Gärdenfors. Relations between the logic of theory
change and nonmonotonic logic. In Proceedings Workshop The Logic of Theory 
Change, Konstanz, Germany, 1989, pages 185-205, Berlin Heidelberg New 
York, 1991. Springer. 

[Moo88] R. Moore. Autoepistemic logic. In P. Smets, E.H. Mamdani, D. Dubois, 
and H. Prade, editors, Non-standard logics for automated reasoning, pages 105- 
136. Academic Press, London, UK, 1988. 

[MR92] C.H. Meyer and W. Rödder. Propagation in Inferenznetzen unter
Berücksichtigung des Prinzips der minimalen relativen Entropie. In Proceedings
of the Annual Conference of the DGOR/ÖGOR 1992, pages 446-453,
Berlin Heidelberg New York, 1992. Springer. 

[MS98] N. Megiddo and R. Srikant. Discovering predictive association rules. In 
Proceedings of the 4th International Conference on Knowledge Discovery in
Databases and Data Mining, 1998. 

[Mül98] W. Müller. Systemkonzept und Prototyp für probabilistisches Data
Mining. Master’s thesis, FernUniversität Hagen, 1998. (Diplomarbeit).

[Nea90] R.E. Neapolitan. Probabilistic Reasoning in expert systems. Wiley, New 
York, 1990. 

[Nil86] N.J. Nilsson. Probabilistic logic. Artificial Intelligence, 28:71-87, 1986. 







[Nut80] D. Nute. Topics in Conditional Logic. D. Reidel Publishing Company,
Dordrecht, Holland, 1980. 

[Par94] J.B. Paris. The uncertain reasoner’s companion - A mathematical perspec- 
tive. Cambridge University Press, 1994. 

[Pea86] J. Pearl. Fusion, propagation and structuring in belief networks. Artificial 
Intelligence, 29:241-288, 1986. 

[Pea88] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kauf- 
mann, San Mateo, Ca., 1988. 

[Pea89] J. Pearl. Probabilistic semantics for nonmonotonic reasoning: A survey. In 
G. Shafer and J. Pearl, editors, Readings in uncertain reasoning, pages 699-710. 
Morgan Kaufmann, San Mateo, CA., 1989. 

[Poo88] D. Poole. A logical framework for default reasoning. Artificial Intelligence, 
36:27-47, 1988. 

[PV90] J.B. Paris and A. Vencovska. A note on the inevitability of maximum 
entropy. International Journal of Approximate Reasoning, 14:183-223, 1990. 

[PV92] J.B. Paris and A. Vencovska. A method for updating that justifies minimum
cross entropy. International Journal of Approximate Reasoning, 7:1-18, 1992. 

[PV97] J.B. Paris and A. Vencovska. In defence of the maximum entropy inference 
process. International Journal of Approximate Reasoning, 17:77-103, 1997. 

[PV98] J.B. Paris and A. Vencovska. Proof systems for probabilistic uncertain 
reasoning. Journal of Symbolic Logic, 63(3):1007-1039, 1998. 

[Ram50] F.P. Ramsey. General propositions and causality. In R.B. Braithwaite,
editor, Foundations of Mathematics and other logical essays, pages 237-257.
Routledge and Kegan Paul, New York, 1950. 

[Rei80] R. Reiter. A logic for default reasoning. Artificial Intelligence, 13:81-132, 
1980. 

[RKI93] H.P. Reidmacher and G. Kern-Isberner. Unsichere logische Regeln in
Expertensystemen mit probabilistischer Wissensbasis. Fachbereich
Wirtschaftswissenschaften, Diskussionsbeitrag 206, FernUniversität Hagen, 1993.

[RKI97a] W. Rödder and G. Kern-Isberner. Léa Sombé und entropie-optimale
Informationsverarbeitung mit der Expertensystem-Shell SPIRIT. OR Spektrum, 
19/3, 1997. 

[RKI97b] W. Rödder and G. Kern-Isberner. Representation and extraction of
information by probabilistic logic. Information Systems, 21(8):637-652, 1997.

[RM96] W. Rödder and C.-H. Meyer. Coherent knowledge processing at maximum
entropy by SPIRIT. In E. Horvitz and F. Jensen, editors, Proceedings 12th
Conference on Uncertainty in Artificial Intelligence, pages 470-476, San Francisco,
Ca., 1996. Morgan Kaufmann. 

[Rot91] H. Rott. A nonmonotonic conditional logic for belief revision, part I: Se- 
mantics and logic of simple conditionals. In A. Fuhrmann and M. Morreau, 
editors, The logic of theory change, pages 135-181. Springer, Berlin, Heidelberg,
New York, 1991.







[SA95] R. Srikant and R. Agrawal. Mining generalized association rules. In Pro- 
ceedings of the 21st VLDB Conference, Zurich, Switzerland, 1995. 

[Sch91] K. Schlechta. Theory revision and probability. Notre Dame Journal of 
Formal Logic, 32(2):45-78, 1991.

[Sch97] K. Schlechta. Nonmonotonic logics: basic concepts, results, and techniques, 
volume 1187. Springer Lecture Notes in Artificial Intelligence, Berlin Heidelberg
New York, 1997. 

[Sch98] M. Schweitzer. Wissensfindung in Datenbanken auf probabilistischer Basis. 
Master’s thesis, FernUniversität Hagen, 1998. (Diplomarbeit).

[SE99] M. Schramm and W. Ertel. Reasoning with probabilities and maximum 
entropy: the system PIT and its application in LEXMED. In Symposium on 
Operations Research, SOR’99, 1999.

[SF97] M. Schramm and V. Fischer. Probabilistic reasoning with maximum entropy 
- the system PIT. In Proceedings of the 12th Workshop on Logic Programming, 
1997. 

[SGS93] P. Spirtes, C. Glymour, and R. Scheines. Causation, Prediction and
Search. Number 81 in Lecture Notes in Statistics. Springer, New York Berlin
Heidelberg, 1993. 

[Sha76] G. Shafer. A mathematical theory of evidence. Princeton University Press, 
Princeton, NJ, 1976. 

[Sha86] G. Shafer. Probability judgment in Artificial Intelligence. In L.N. Kanal
and J.F. Lemmer, editors, Uncertainty in Artificial Intelligence, pages 127-135.
North-Holland, Amsterdam, 1986. 

[Sho86] J.E. Shore. Relative entropy, probabilistic inference and AI. In L.N. Kanal 
and J.F. Lemmer, editors, Uncertainty in Artificial Intelligence, pages 211-215.
North-Holland, Amsterdam, 1986. 

[SJ80] J.E. Shore and R.W. Johnson. Axiomatic derivation of the principle of
maximum entropy and the principle of minimum cross-entropy. IEEE Transactions
on Information Theory, IT-26:26-37, 1980. 

[SJ81] J.E. Shore and R.W. Johnson. Properties of cross-entropy minimization. 
IEEE Transactions on Information Theory, IT-27:472-482, 1981. 

[Som92] Léa Sombé. Schließen bei unsicherem Wissen in der Künstlichen
Intelligenz. Vieweg, Braunschweig, 1992.

[Spi91] M. Spies. Combination of evidence with conditional objects and its
application to cognitive modeling. In I.R. Goodman, M.M. Gupta, H.T. Nguyen,
and G.S. Rogers, editors, Conditional logic in expert systems, pages 181-209.
Elsevier, North-Holland, 1991. 

[Spo88] W. Spohn. Ordinal conditional functions: a dynamic theory of epistemic 
states. In W.L. Harper and B. Skyrms, editors, Causation in Decision, Belief
Change, and Statistics, II, pages 105-134. Kluwer Academic Publishers, 1988. 

[SW76] C.E. Shannon and W. Weaver. Mathematische Grundlagen der
Informationstheorie. Oldenbourg, München Wien, 1976.







[TGK91] H. Thöne, U. Güntzer, and W. Kiessling. Probabilistic reasoning with
facts and rules in deductive databases. In Proceedings European Conference on 
Symbolic and Quantitative Approaches for Uncertainty (ECSQAU), Marseille, 
pages 333-337, Berlin, Heidelberg, New York, 1991. Springer. 

[TGK92] H. Thöne, U. Güntzer, and W. Kiessling. Towards precision of
probabilistic bounds propagation. In D. Dubois, M.P. Wellmann, B. D’Ambrosio,
and P. Smets, editors, Proceedings 8th Conference on Uncertainty in Artificial
Intelligence, pages 315-322, San Mateo, Ca., 1992. Morgan Kaufmann.

[Thi89] H. Thiele. Monotones und nichtmonotones Schliessen. In J. Grabowski, 
K.P. Jantke, and H. Thiele, editors, Grundlagen der Künstlichen Intelligenz,
pages 80-160. Akademie-Verlag, Berlin, 1989. 

[Tou86] D. Touretzky. The Mathematics of Inheritance Systems. Pitman and
Morgan Kaufmann, London and Los Altos, California, 1986.

[Voo96a] F. Voorbraak. Probabilistic belief expansion and conditioning. Technical 
Report LP-96-07, Institute for Logic, Language and Computation, University 
of Amsterdam, 1996. 

[Voo96b] F. Voorbraak. Reasoning with uncertainty in AI. In L. Dorst, M. van 
Lambalgen, and F. Voorbraak, editors, Proceedings Reasoning with Uncertainty 
in Robotics (RUR’95), LNCS/LNAI 1093, pages 52-90, Berlin, 1996. Springer. 

[Whi90] J. Whittaker. Graphical models in applied multivariate statistics. John
Wiley & Sons, New York, 1990. 

[Wil94] M.-A. Williams. Transmutations of knowledge systems. In Proceedings 
Fourth International Conference on Principles of Knowledge Representation 
and Reasoning, KR-94, pages 619-629. Morgan Kaufmann, 1994.

[Yag85] R.R. Yager. Inference in a multivalued logic system. Int. J. Man-Machine 
Studies, 23:27-44, 1985. 

[Zad83] L.A. Zadeh. The role of fuzzy logic in the management of uncertainty in 
expert systems. Fuzzy Sets Syst, 11:199-227, 1983. 




Index 



T 8 
_L 8 

n 8 

n 43 

«0 46 

n+ 48 
n+ 48 

ni 63 
A 47 

/orm(o)i,a)2, • • •) 
AP 81 
n 10 

B 123 
V 8 
C 8 

12 

C) 9 



{C 

£v 

£* 



prob 

prob,- 



9,31 
' 96 



31 

C)-^ 35 

£.)* 31 

30 

104 

£*{C^) 105 

C{A) 12 
Cn{A) 13 
Criproh{P) 92 
Cn*{n) 105 
C 105 
Cp^ 93 

105 

a,i 123 
C{P, TV) 76 
Mod (A) 8 

Mod {TV) 10 
Mod* (TZ) 104 
Mod{K) 29 



Mod{'P) 21 
min{A\P) 21 
Bd{P) 18 
Bel\p) 29 
BeZ(fv) 29 
Th{P) 10 
Th*{'P) 31 

{B\A){lj) 28 

{B\A.) 9,27 

[B\A)[x\ 9 

{B\A)[n] 31,35 

{b\A)+ 36 

{B\A)- 36 

A — >• B[x] 96 
P[A) 9, 29 
PiB\A) 10,29 
k{A) 29 
k(B\A) 30 
n{A) 30 
n{B\A) 30 
V{A) 33 
V{B\A) 32 
Po 24 
Pme 25, 76 

, . . . , , OLji ] 77 

P[ai, . . . ,an]F 84 
H{P) 24 
R{Q,P) 24 
w/(P‘) 78 

WF{P, TV) 78 
WQf{P,TV) 84 
37 

ipv,i 123 
^ 9,10,58 

^prob Q2 

104 
= 9,28 

45 

no 







— 9 


126 


[iD], 


^7. 45 




12 


93 




16 




20,22 




32 


< 


76 


-f 14
* 14
*S 60
*DP 60
*0 61
*ME 89, 93
* p 84
— 15
O 16
o 110
© 31
0 31
0-4 31
1-4 31
IZ 36, 110




106 


n 


37 


u 


37 


JL 


39 




41 


a* 


41 




123 


b„,i 


123 


b„,i 


125 




41 




123 


Cfi 


42 


O'(fllA) 4:1 


U7^ 


42 


OB 


123 


Z^V,l 


123 


ker 


a 44 


kero 


1 cr 47 



adjustment condition 71 
affirmative set 36 
AGM-postulates 

- for revising epistemic states 19 

- for contraction 15 

- for expansion 14 

- for revision 14 
antecedent 9 



antecedent conjunction problem 86, 
101 

association rule 120

- confidence 120 

- support 120 
atomicity condition 82 
atomicity principle 79 

belief base 109 
belief set 14 

c-adaptations 67 
categorical specificity 98 
cautious cut 100
cautious monotonicity 12, 99 
condition 

- adjustment 71 

- atomicity 82 

- continuity 82 

- positivity 71 

- relevance 81 

- smoothness 21 

- uniqueness 85 
conditional 9, 18, 27, 28 

- basic 37 

- contradictory 36 

- infimum of conditionals 37 

- OCF-conditional 35 

- perpendicular conditionals 39 
- probabilistic 9, 35

- quantified 34

- single-elementary 121 

- subconditional 36 

- supremum of conditionals 37 

- tautological 36 

- validity of 34 
conditional function 

- ordinal 29 

- simple 18 

conditional independence 33 
conditional indifference 

- strict conditional indifference with
respect to R 49

- strict conditional indifference with
respect to R and V 63

- weak conditional indifference with
respect to R 49

- with respect to R and V 63
conditional inference operation 105 

- complete 105 
conditional preservation 53 







- principle of conditional preservation 
63 

conditional structure 

- of ω (representation of) 42

- of D (representation of) 43 

- with respect to R 45
conditional valuation function 32

- (strictly) faithful with respect to R
50

- uniform 33
conditionalization 13, 59, 95 

- (A, m)-conditionalization 60
conflicting evidence 99 
conflicting set 36 
conjunction 

- complete 8 

- elementary 8 
conjunction left 100 
conjunction right 100 
consequence 9 
consequence operation 13 
consistent 10 

- P-consistent 76 

- strictly V-consistent 63 

- V-consistent 63 
continuity condition 82 
contraction 15 
c-representation 67 

- faithful 68 
c-revision 67 

- faithful 68 
cross-entropy 24 
cumulativity 12 

- strong 107 
cut 12, 92 

data mining 5, 119 
distribution 13 

- probability 28 
DP-postulates 22 

entropy 24 
epistemic state 17 
erasure 16 
expansion 14 

fact 36 

faithful assignment 16 

- for epistemic states 20 
focusing 116 

founded 106 



free abelian group 41 
full absorption 13 
function 

- conditional valuation 32 

- ordinal conditional 29 

- probability 28 

- ranking 29 

- relative change 63 

- simple conditional 18 

Harper identity 15 

idempotence 92 
identity 

- Harper 15 

- Levi 15 
inclusion 12, 92 
inference operation 

- conditional 105 

- ME-inference operation 93 

- nonmonotonic 12 

- probabilistic 93 

- universal 105 
inference relation 

- ME-inference relation 93 

- nonmonotonic 12 

- probabilistic 93 
infimum of conditionals 37 
informational economy 25 
inverse maxent problem 131 
inverse representation problem 4 

Jeffrey conditionalization 23 
Jeffrey’s rule 23 

kernel (of a homomorphism) 44 
KM-postulates (for updating) 16 
knowledge discovery 5, 119 

left absorption 13 
left logical equivalence 13 
LEG-networks 137 
Levi identity 15 
limit assumption 21 
literal 8 

logical coherence 86 
loop 13 

ME-inference 93 

- categorical specificity 98 

- cautious cut 100 







- cautious monotonicity 99 

- conjunction left 100 

- conjunction right 100 

- reasoning by cases 101 

- transitive chaining 98 
ME-inference operation 93 
ME-inference relation 93 
ME-principle 25 
monotonicity 11, 13, 92 

- cautious 12 

- rational 13 

neighboring worlds 8 
neutral set 36 

nonmonotonic inference operation 12 

- cumulative 12 

- idempotent 12 

- reflexive 12 

- supraclassical 13 
nonmonotonic inference relation 12 

OCF see ordinal conditional function 

OCF-conditional 35 

ordinal conditional function 29 

P-consistency 76 
plausibility (pre-)ordering 21 
plausibility measure 33 
positivity condition 71 
possibilistic logic 139 
possibility distribution 30 
possibility measure 30 
postulates 

- for revising by sets of conditionals: 
113 

- AGM-postulates 14 

- DP-postulates 22 

- for conditional base revision 111 

- for conditional revision 55 

- KM-postulates 16 
premise 9 

principle of conditional preservation 
63 

principle of maximum entropy 25 
principle of minimum cross-entropy 
25 

probabilistic conditional 9, 35 
probabilistic equivalence 87 
probabilistic fact 10 
probabilistic inference operation 93 

- complete 93 



probabilistic inference relation 93 
probabilistic rule 9 
probabilistically equivalent 10 
probability distribution 9, 28 
probability function 28 

Ramsey test 18 
ranking 21 

rational monotonicity 13 
reasoning by cases 101 
reciprocity 12 
relative change function 63 
relevance condition 81 
representation problem 67 
revision 14 
right weakening 13 

simple conditional function 18 
Simpson’s paradox 39 
smoothness condition 21 
SPIRIT 137 

standard probabilistic consequence 
operation 92 

standard probabilistic consequence 
relation 92 

strong cumulativity 107 
subconditional 36 
supremum of conditionals 37 
system P 23 
system-Z 69

transitive chaining 98 
transmutations 53 
triviality result 20 

uniqueness assumption 31 
uniqueness condition 85 
universal inference operation 105 

- cumulative 105 

- faithful 106 

- founded 106 

- idempotent 105 

- preserving consistency 105 

- reflexive 105 

- strongly cumulative 107 
update 16 

V-consistency 63, 106 

weight factors 78 
weight quotients 84 




